# TERA MTA Principles of Operation

TERA Computer Company, 2815 Eastlake Ave East, Seattle, WA 98102

November 18, 1997 (Composite Revision: 4.263)

# Preface

This document is constantly evolving.

[[ Details that were being rethought when the document was printed appear in this type style; depending on the context, such a note indicates that additional explanatory text is needed, the design has not been thought out, or the feature being described is deprecated. ]]

# Contents

- 1 Introduction
  - 1.1 Notation
  - 1.2 Data Types
  - 1.3 Storage Classes
- 2 Streams
  - 2.1 Stream Status Word
  - 2.2 Branches and Targets
- 3 Instructions
  - 3.1 Lookahead
- 4 Condition Codes
  - 4.1 Select Operations
- 5 Floating-point Arithmetic
  - 5.1 Floating-point Formats
  - 5.2 Rounding
  - 5.3 Floating-point Exceptions
- 6 Data Memory
  - 6.1 Data Memory Access
  - 6.2 Data Memory Address Translation
  - 6.3 M-unit Internal State
  - 6.4 Speculative Loads
- 7 Program Memory
  - 7.1 Program Memory Address Translation
  - 7.2 The Instruction Cache
- 8 Levels and Protection Domains
  - 8.1 Levels
  - 8.2 Protection Domains
  - 8.3 Stream Resource Control
  - 8.4 Data State Descriptor
  - 8.5 Program State Descriptor
- 9 Exceptions and Traps
  - 9.1 Exceptions
  - 9.2 Traps
- 10 Resource Counters
  - 10.1 Instruction Counter
  - 10.2 Protection Domain Counters
  - 10.3 Processor Counters
- 11 [...]
  - 11.1 Notation
  - Operation Naming Conventions
  - Pseudo-code Operators
  - [...]perations
  - [...] Operations
  - [...] Memory Operations
  - Domain Operations
  - Exception Operations
  - Float Operations
  - Integer Operations
  - Jump Operations
  - [...]al Operations
  - [...]peration Operations
  - [...]e Operations
  - Program Cache Operations
  - Program Map Operations
  - [...]er Operations
  - Result Code Operations
  - Rotate Operations
  - [...] Operations
  - [...] Operations
  - Skip Operations
  - State Control Operations
  - Store Operations
  - Stream Operations
  - Resource Accounting Operations
  - Level Manipulations Operations
  - Unsigned Operations
- 12 Programming Examples
  - 12.1 Stream Creation
  - 12.2 Forwarding Pointers
  - 12.3 Vector Loops
  - 12.4 Doubled Precision Floating-point Arithmetic
  - 12.5 Floating-point Division and Square Root
  - 12.6 Integer Division
- 13 I/O Processor Introduction
  - 13.1 Link Status Word
- 14 I/O Operation Descriptions
  - Initialization Operations
  - Memory Load Stream Operations
  - Memory Store Stream Operations
  - HIPPI Out Stream Operations
  - HIPPI In Stream Operations
- 15 I/O Processor Examples
  - 15.1 Loading Memory
  - 15.2 Sending Data
  - 15.3 Receiving Data
  - 15.4 Storing Memory
- A Operation Encoding Summary
  - A.1 M OPs
  - A.2 MC OPs
  - A.3 A OPs
  - A.5 MAC OPs
  - A.6 I OPs
- B Processor State
- C GF(2) Addressing Matrices
  - C.1 Scrambling Matrices
- Index

# List of Figures

| Figure     | Title                               |
|------------|-------------------------------------|
| 6.1        | Data Mapping Logic Block Diagram    |
| 7.1        | Program Mapping Logic Block Diagram |
| 11.1, 11.2 | Data Type Prefixes                  |

# Structures

- StreamStatusWord
- Operation
- Float64
- Float32
- SpecialFloat64
- AccessState
- Pointer
- OperationAccessControl
- DataAddress
- DataMapEntry
- DomainDataAddress
- DomainDataTLBAddress
- DataControlDescriptor
- ProgramAddress
- ProgramMapEntry
- DomainProgramAddress
- ProgTlbAddr
- ProgL2Address
- L1Address
- L2Address
- DataStateDescriptor
- ProgramStateDescriptor
- ExceptionRegister
- ResultCode
- EventSelect
- OPStatusWord

# Enumerations

- CondMask
- IntSelect
- FloatSelect
- RoundMode
- NaNResultCode
- FullEmptyControl
- Resource
- RetryOpCode
- Level
- Exception
- FloatResultCode
- DataResultCode
- CountSource
- ProbeControl
- opStream


# Chapter 1: Introduction

### 1.1 Notation

This document defines structure and enumeration data types for use by system software. An enumeration definition names the enumeration and the members, gives the integral value of each member, and may give one or more columns of commentary.

A structure definition names the structure (typically of a hardware register) and describes the fields. Fields are written using the notation "Bits hbn-lbn" where hbn is the high bit number and lbn is the low bit number. The width of the field is hbn-lbn+1. Each field has a field name, a type, and one or more columns of commentary text. The field type is either a predefined type or an enumeration type declared elsewhere.
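The "Bits hbn-lbn" notation maps directly onto shift-and-mask code. The following is a minimal sketch, assuming 64-bit words; the helper names `field_width`, `field_get`, and `field_set` are hypothetical and are not part of the definitions supplied with this manual.

```c
#include <assert.h>
#include <stdint.h>

/* Width of a field written as "Bits hbn-lbn": hbn - lbn + 1. */
unsigned field_width(unsigned hbn, unsigned lbn)
{
    assert(hbn >= lbn && hbn < 64);
    return hbn - lbn + 1;
}

/* Mask with the low `width` bits set; handles the full 64-bit case. */
uint64_t field_mask(unsigned width)
{
    return (width == 64) ? ~UINT64_C(0) : ((UINT64_C(1) << width) - 1);
}

/* Extract bits hbn..lbn of a 64-bit word, right-justified. */
uint64_t field_get(uint64_t word, unsigned hbn, unsigned lbn)
{
    return (word >> lbn) & field_mask(field_width(hbn, lbn));
}

/* Return a copy of `word` with bits hbn..lbn replaced by `value`. */
uint64_t field_set(uint64_t word, unsigned hbn, unsigned lbn, uint64_t value)
{
    uint64_t mask = field_mask(field_width(hbn, lbn));
    return (word & ~(mask << lbn)) | ((value & mask) << lbn);
}
```

For example, a field described as "Bits 63-61" has width 3, and `field_get(word, 63, 61)` returns its value in the low three bits of the result.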

The enumeration names, enumeration members, structure names, field names, and base types all appear in the index. The enumerations and structures defined in this manual are available for use in assembly language and C programs, including the assembler and compilers themselves.

The notation used for operations and instructions is described in §11.1.

Some descriptions in this document include program fragments. Fragments are formatted so that keywords appear in bold face and comments appear in italics.

### 1.2 Data Types

The memory system can load and store 8-bit bytes, 16-bit quarterwords (2 bytes), 32-bit halfwords (4 bytes), or 64-bit words (8 bytes). Bits are numbered from right to left: the least significant bit is bit number 0.

The most important architecturally supported data types are these:

#### bit vector

A bit vector may be of any length and may span one or more word boundaries.

#### signed integer

Signed integers are interpreted in two's complement. Byte, quarterword, and halfword signed integers are sign-extended to 64 bits when they are loaded and quietly truncated to the proper length when they are stored.

#### unsigned integer

Byte, quarterword, and halfword unsigned integers are zero-extended to 64 bits when they are loaded and quietly truncated to the proper length when they are stored.
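The load and store conventions for short integers can be modeled in C as follows. This is an illustrative sketch of the extension and truncation rules only, not of the memory operations themselves; the function names are hypothetical, and `bits` is assumed to be 8, 16, or 32.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering the low `bits` bits of a word; bits is 8, 16, or 32. */
uint64_t low_mask(unsigned bits)
{
    assert(bits == 8 || bits == 16 || bits == 32);
    return (UINT64_C(1) << bits) - 1;
}

/* Signed load: the short value is sign-extended to 64 bits. */
int64_t load_signed(uint64_t raw, unsigned bits)
{
    uint64_t low  = raw & low_mask(bits);
    uint64_t sign = UINT64_C(1) << (bits - 1);
    /* Flip the sign bit, then subtract it: this propagates the sign upward. */
    return (int64_t)((low ^ sign) - sign);
}

/* Unsigned load: the short value is zero-extended to 64 bits. */
uint64_t load_unsigned(uint64_t raw, unsigned bits)
{
    return raw & low_mask(bits);
}

/* Store: the 64-bit register value is quietly truncated to `bits` bits. */
uint64_t store_truncate(uint64_t value, unsigned bits)
{
    return value & low_mask(bits);
}
```

Under this model a byte holding 0x80 loads as -128 when signed and as 128 when unsigned, and storing 0x12345 as a quarterword keeps only 0x2345 without raising any exception.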

#### floating point

Floating-point numbers and operations conform to IEEE Standard 754. Single (32-bit) and double (64-bit) basic formats are supported. Support for a 128-bit floating-point format is also provided.

#### pointer

A pointer has two subfields. The most significant 16 bits are the access control field, described in §6.1. The remaining 48 bits make up the address field, described in §6.2.
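As a sketch of the pointer layout described above, a 64-bit pointer can be split into and rebuilt from its two subfields as shown here. The macro and function names are hypothetical; the internal structure of the access control field (§6.1) is not modeled.

```c
#include <assert.h>
#include <stdint.h>

#define POINTER_ADDRESS_BITS 48
#define POINTER_ADDRESS_MASK ((UINT64_C(1) << POINTER_ADDRESS_BITS) - 1)

/* Most significant 16 bits: the access control field. */
uint16_t pointer_access_control(uint64_t pointer)
{
    return (uint16_t)(pointer >> POINTER_ADDRESS_BITS);
}

/* Remaining 48 bits: the address field. */
uint64_t pointer_address(uint64_t pointer)
{
    return pointer & POINTER_ADDRESS_MASK;
}

/* Combine the two subfields; the address must fit in 48 bits. */
uint64_t pointer_pack(uint16_t access_control, uint64_t address)
{
    assert((address & ~POINTER_ADDRESS_MASK) == 0);
    return ((uint64_t)access_control << POINTER_ADDRESS_BITS) | address;
}
```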

#### instruction

Instructions, composed of operations, are described in §3.

#### stream status word

A stream status word (ssw) contains status and control information for the instruction stream in its upper halfword and a program counter in the lower halfword. It is described in §2.1.

#### resource counter

The processor counts interesting events for accounting and performance monitoring. The counters are described in §10.

Several data types are derived from type Boolean, a single-bit unsigned type, where 0 is false and 1 is true. The name of each derived type is a mnemonic to help interpret what the bit controls when active—namely when it is set, is true, or is assigned 1, all of which are equivalent terms. For example, a variable of type Flag notes that an exception has occurred if it is set; a variable of type SignBit indicates a negative number if it is set.

Several other types are implicitly derived from type Uns, an unsigned datum of length at most 64 bits, as shown below:

| type           | width (bits) | base type | description                                                                             |
|----------------|--------------|-----------|-----------------------------------------------------------------------------------------|
| Reg            | 5            | Uns       | a register number                                                                       |
| ProgramAddrUns | 32           | Uns       | a ProgramAddress structure treated as an Uns; see §2.1                                  |
| PageNumber     | 20           | Uns       | a virtual program page address                                                          |
| ProgFrame      | 17           | Uns       | a physical memory offset                                                                |
| DataAddrUns    | 48           | Uns       | a DataAddress treated as an Uns; see §6.1                                               |
| DataSegment    | 20           | Uns       | a virtual data segment number                                                           |
| SegmentOffset  | 15           | Uns       | an offset into a virtual data segment                                                   |
| DataFrame      | 19           | Uns       | a physical memory offset specified as a frame number or a physical memory frame number |

### 1.3 Storage Classes

Each stream has available a number of different kinds of storage.

- There is a large amount of memory, all of it potentially available to any stream on any processor in the system. Data memory units adjacent to the referencing stream's processor have relatively low latency. This adjacent data memory is referred to as "local" and is currently used only to store instructions, data maps, and program maps for the local processor. Most data memory accesses are distributed across the entire system. The part of data memory that stores instructions for its processor is sometimes called "program memory". Every word in data memory has a four-bit access state, which modifies the behavior of memory references to any part of the word; see §6.1.

- The 31 general-purpose registers are used as the sources and destination for almost all operations. Register 0 always reads as 64 bits of 0, and values written into it are discarded.
- The stream status word (SSW) contains condition codes, the trap mask, the mode, and the program counter. The SSW is described in §2.1.
- The eight target registers contain program addresses and are used as arguments for branch operations; Target 0 points to the trap handler. See §2.2.
- The exception register flags the exception(s) that have been detected and raised. A raised exception will cause a trap if the trap is not disabled by the appropriate bit in the trap mask of the stream status word. The exception register also contains the register poison flags. See §9.1.
- The result code register describes exceptional result values from the function units: see §9.1.
- The trap registers are used by the trap handler to save the state of the trapping stream. The trap registers are described in §9.2.

# Chapter 2: Streams

Each physical processor supports a variable number of instruction streams, or streams for short. Each stream appears to be (and is programmed like) a wide-instruction RISC processor. The processor hardware selects streams for execution and executes a single instruction from each in turn. Streams are allocated, created, and destroyed dynamically; the active streams are multiplexed by the processor hardware onto a single set of pipelined functional units.

Streams may be active or idle. An active stream competes with other streams to issue instructions, while idle streams do not. A stream is activated and initialized with a skeleton execution environment by the unprivileged STREAM\_CREATE operation. Unprivileged STREAM\_RESERVE operations are used to reserve a number of idle streams for subsequent activation by STREAM\_CREATE. The STREAM\_QUIT operation returns a stream that executes it to the idle state.

A stream executes at one of four privilege levels: user, supervisor, kernel, or IPL. The privilege level of a stream determines the operations it may execute and the kinds of memory access it is permitted. Levels are described further in §8.1.

Each active stream in a processor belongs to one of sixteen protection domains. A protection domain has registers that limit the number of streams it can contain and define the memory accesses available to its streams. Protection domains are described in more detail in §8.2.

### 2.1 Stream Status Word

The stream status word (ssw) is shown below. The ssw contains the condition codes from the most recent four "\_TEST" operations; a trap mask which selectively disables traps from raised exceptions; a mode field describing how arithmetic, memory references, and lookahead are to be done; and a program counter containing the address of the instruction being executed.

| Bits  | Wd | Field Name                   | Type           | Description                                                                                                            |
|-------|----|------------------------------|----------------|------------------------------------------------------------------------------------------------------------------------|
| StreamStatusWord: Condition Vector |    |     |                |                                                                                                                        |
| 63-61 | 3  | cc_3                         | CondCode       | condition code CV<sub>3</sub>: result from fourth most recent _TEST operation; see §4                                  |
| 60-58 | 3  | cc_2                         | CondCode       | condition code CV<sub>2</sub>: result from third most recent _TEST operation; see §4                                   |
| 57-55 | 3  | cc_1                         | CondCode       | condition code CV<sub>1</sub>: result from second most recent _TEST operation; see §4                                  |
| 54-52 | 3  | cc_0                         | CondCode       | condition code CV<sub>0</sub>: result from most recent _TEST operation; see §4                                         |
| StreamStatusWord: Trap Mask |    |            |                |                                                                                                                        |
| 51    | 1  | hardware_trap_disable        | Boolean        | disable hardware traps                                                                                                 |
| 50    | 1  | system_trap_disable          | Boolean        | disable system traps                                                                                                   |
| 49    | 1  | domain_signal_trap_disable   | Boolean        | disable domain signal traps                                                                                            |
| 48    | 1  | user_trap_disable            | Boolean        | disable user traps                                                                                                     |
| 47-45 | 3  | 0                            |                | reserved                                                                                                               |
| 44    | 1  | float_invalid_trap_disable   | Boolean        | disable float invalid trap                                                                                             |
| 43    | 1  | float_zero_div_trap_disable  | Boolean        | disable float zero divide trap                                                                                         |
| 42    | 1  | float_overflow_trap_disable  | Boolean        | disable float overflow trap                                                                                            |
| 41    | 1  | float_underflow_trap_disable | Boolean        | disable float underflow trap                                                                                           |
| 40    | 1  | float_inexact_trap_disable   | Boolean        | disable float inexact trap                                                                                             |
| StreamStatusWord: Mode |    |                 |                |                                                                                                                        |
| 39    | 1  | 0                            |                | reserved                                                                                                               |
| 38    | 1  | ssw_override                 | Boolean        | disables all traps, lookahead, and the instruction counter; allows some memory operations to retry forever; see §9.2   |
| 37    | 1  | spec_load_enable             | Boolean        | allows loads to be speculative; see §6.4                                                                               |
| 36    | 1  | unaligned_data_enable        | Boolean        | prevents unaligned data from raising the data_alignment exception; see §6.1                                            |
| 35    | 1  | lookahead_disable            | Boolean        | disables lookahead, so that each memory operation finishes before the next instruction is issued; see §3.1             |
| 34    | 1  | count_disable                | Boolean        | disables the instruction counter; see §10                                                                              |
| 33-32 | 2  | round_mode                   | RoundMode      | floating-point rounding mode; see §5.2                                                                                 |
| StreamStatusWord: Program Counter |    |      |                |                                                                                                                        |
| 31-0  | 32 | pc                           | ProgramAddrUns | the program counter                                                                                                    |

The field "pc", shown here as type ProgramAddrUns, is actually a structure of type ProgramAddress, used in program address translation; see §7.1.
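To make the layout concrete, the sketch below reads a few ssw fields from a 64-bit value using the bit positions given in the table above. It is an assumption-laden illustration: the accessor names and the `SSW_*` constants are hypothetical, and the program counter is taken to occupy bits 31-0 (the lower halfword).

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions of some single-bit ssw fields, per the table above. */
enum {
    SSW_USER_TRAP_DISABLE     = 48,
    SSW_SSW_OVERRIDE          = 38,
    SSW_SPEC_LOAD_ENABLE      = 37,
    SSW_UNALIGNED_DATA_ENABLE = 36,
    SSW_LOOKAHEAD_DISABLE     = 35,
    SSW_COUNT_DISABLE         = 34
};

/* Condition code CV_i; i = 0 is the most recent. cc_0 is bits 54-52. */
unsigned ssw_cc(uint64_t ssw, unsigned i)
{
    assert(i < 4);
    return (unsigned)((ssw >> (52 + 3 * i)) & 0x7);
}

/* Value of the Boolean field at bit position `bit`. */
unsigned ssw_bit(uint64_t ssw, unsigned bit)
{
    assert(bit < 64);
    return (unsigned)((ssw >> bit) & 1);
}

/* round_mode, bits 33-32. */
unsigned ssw_round_mode(uint64_t ssw)
{
    return (unsigned)((ssw >> 32) & 0x3);
}

/* Program counter, treated as a 32-bit ProgramAddrUns. */
uint32_t ssw_pc(uint64_t ssw)
{
    return (uint32_t)(ssw & UINT64_C(0xFFFFFFFF));
}
```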

### 2.2 Branches and Targets

There are two major families of branch operations. The JUMP family is intended for general long-distance transfers including subroutine calls. The SKIP family adds a small positive offset to the program counter and is intended for the short forward transfers needed in if-then-else situations. The JUMP and SKIP families have variants for terminating lookahead if the branch is or is not taken; see §3.1.

Jumps are performed in two distinct operations. First, a TARGET operation loads a target register with a program address. Second, the JUMP operation is executed, conditionally setting the ssw.pc to the contents of the specified target register. Separating these two concerns lets the processor prefetch instructions down an execution path that may be taken in the future. Loading target registers with invalid addresses will not raise an exception unless and until the target register is used in a successful JUMP operation.

There are eight target registers. Each target register contains a program counter; see §2.1. Target register T0 is reserved for the address of the trap handler. It is automatically exchanged with the ssw.pc on a trap, and can be written by unprivileged streams (unless the "priv\_t0" bit in the program state of the protection domain prohibits it). When a target register is loaded, the program cache attempts to prefetch the line containing the new address; see §7.2.


# Chapter 3: Instructions

Every instruction is 64 bits long, and generally contains four fields describing lookahead, an M-operation, an A-operation, and a C-operation. These fields are shown here.

| Bits      | Wd | Field Name | Type | Description |
|-----------|----|------------|------|-------------|
| Operation |    |            |      |             |
| 63-61     | 3  | la         | Uns  | lookahead   |
| 60-47     | 14 | Mop        | Uns  | M-operation |
| 46-21     | 26 | Aop        | Uns  | A-operation |
| 20-0      | 21 | Cop        | Uns  | C-operation |

The lookahead field is used to control M-unit operation overlap and is described in §3.1. In general, an M-unit operation (M-operation) accesses memory in some way, an A-unit operation (A-operation) performs arithmetic, and a C-unit operation (C-operation) is primarily responsible for control flow. The C-operation can also do some arithmetic operations, exclusive of multiplication. Nearly every arithmetic operation that can be done in a C-operation can also be done by an A-operation.
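The four fields account for all 64 bits of an instruction word (3 + 14 + 26 + 21). The following sketch separates a word into those fields and reassembles it; the type and function names are hypothetical, and no attempt is made to interpret the operation encodings themselves.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t la;   /* bits 63-61: lookahead   */
    uint32_t mop;  /* bits 60-47: M-operation */
    uint32_t aop;  /* bits 46-21: A-operation */
    uint32_t cop;  /* bits 20-0:  C-operation */
} InstructionFields;

/* Split a 64-bit instruction word into its four fields. */
InstructionFields instruction_split(uint64_t word)
{
    InstructionFields f;
    f.la  = (uint32_t)((word >> 61) & 0x7);
    f.mop = (uint32_t)((word >> 47) & 0x3FFF);
    f.aop = (uint32_t)((word >> 21) & 0x3FFFFFF);
    f.cop = (uint32_t)(word & 0x1FFFFF);
    return f;
}

/* Reassemble an instruction word; each field must fit its width. */
uint64_t instruction_join(InstructionFields f)
{
    assert(f.la <= 0x7 && f.mop <= 0x3FFF);
    assert(f.aop <= 0x3FFFFFF && f.cop <= 0x1FFFFF);
    return ((uint64_t)f.la << 61) | ((uint64_t)f.mop << 47)
         | ((uint64_t)f.aop << 21) | (uint64_t)f.cop;
}
```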

Some operations are encoded by combining multiple operation fields. For example, an MC-operation such as INT\_LOAD\_DISP uses both the M- and C-operation fields. STREAM\_CREATE and STREAM\_QUIT are MAC-operations.

The operations in an instruction are decoded in parallel. If any of them is invalid, either because it is a privileged operation at the current protection level or it is an illegal operation encoding, a privileged operation exception is raised, and no part of the instruction is issued.

The decoded operations are executed in parallel. All operands for all operations in the instruction are read before any result is written. Results are written in an implementation-dependent order, so if two or more operations in an instruction write to the same destination register, the resulting value is undefined. Thus, such an instruction is illegal. Once instruction execution has begun, the destination registers are always written, regardless of whether the operation later raises an exception or traps.

The program counter (PC) follows the same rule for reading and writing as the operands. The PC is read when the instruction is issued and is written when the instruction completes. The written value is either an incremented value for normal sequential flow or a new value from a branch.

The individual operations are described in §11.

### 3.1 Lookahead

The lookahead field is a three-bit unsigned integer that the code generator must guarantee to be less than or equal to the minimum number of instructions that the stream might execute before encountering one that depends on the current M-, MC-, or MAC-operation. The maximum possible lookahead value is seven. If there is no such operation, the code generator should set the lookahead to the maximum of seven. If the code generator is ignorant of the relevant dependences, the lookahead may be set to zero. The lookahead must take into account all branch paths that are lookahead-enabled, as described below.

An instruction J depends on an M-unit operation (M-, MC-, or MAC-operation) at a prior instruction I if any operation in J uses or defines a register or a part of a register implicitly or explicitly defined by the M-unit operation in I. In addition, an instruction J depends on an M-unit operation at a prior instruction I if the M-unit operation in J references some of the same memory referenced in I and the memory is modified by either or both of I and J. These definitions are manifestations of standard data dependence rules.
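The register half of this rule can be sketched as a small routine that a code generator might use on straight-line code. This is a simplified model under stated assumptions, not the algorithm of any TERA tool: `InsnSummary` and `lookahead_for` are hypothetical names, each instruction is summarized by register bit masks, and memory dependences and branch paths are ignored. When the known sequence ends before a dependent instruction is found, the routine conservatively returns the number of instructions known to be independent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LOOKAHEAD 7u

typedef struct {
    int      has_m_op; /* nonzero if there is an M-, MC-, or MAC-operation */
    uint32_t m_defs;   /* registers defined by that operation (bit r = register r) */
    uint32_t touched;  /* registers used or defined by any operation */
} InsnSummary;

/* Lookahead for instruction i of a straight-line sequence of n instructions. */
unsigned lookahead_for(const InsnSummary *code, size_t n, size_t i)
{
    assert(i < n);
    if (!code[i].has_m_op)
        return MAX_LOOKAHEAD;          /* nothing for later instructions to wait on */

    unsigned independent = 0;
    for (size_t j = i + 1; j < n && independent < MAX_LOOKAHEAD; j++) {
        if (code[j].touched & code[i].m_defs)
            break;                     /* instruction j depends on instruction i */
        independent++;
    }
    return independent;
}
```

If the instruction immediately following a load uses the loaded register, the result is zero; if the first use is two instructions later, the result is one.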

The lookahead field supplies the hardware with an upper bound on the number of additional instructions that may begin execution before the current M-unit operation is finished. For example, if lookahead is zero throughout a program, then the processor will finish each M-, MC-, or MAC-operation before starting the next instruction. Lookahead can be disabled to get the same effect by setting the mode bit field "lookahead\_disable" in the ssw.

Branch paths are determined by branching operations and their corresponding skip amounts or target registers. All conditional branch operations have variants that disable lookahead on one of the two paths. The "\_SELDOM" branch operations (JUMP\_SELDOM, SKIP\_SELDOM) disable lookahead when the transfer is taken, and the "\_OFTEN" branch operations (JUMP\_OFTEN, SKIP\_OFTEN) disable lookahead when the transfer is not taken. The effect of disabling lookahead is to require all outstanding memory references to complete before the next instruction is allowed to execute.


# Chapter 4: Condition Codes

Many operations have alternate versions (with "\_TEST" appended to the mnemonic) that generate a condition code in addition to a value in a register. The eight possible condition code values and their default meanings are shown below, where 0, p, and n stand for zero, a positive integer, and a negative integer, respectively.

| Name           | Value | Meaning                | Examples             |
|----------------|-------|------------------------|----------------------|
| CondCode       |       |                        |                      |
| COND_ZERO_NC   | 0     | Zero, no carry         | 0 = 0 + 0            |
| COND_NEG_NC    | 1     | Negative, no carry     | n = p + n, n = p - p |
| COND_POS_NC    | 2     | Positive, no carry     | p = p + p, p = p - n |
| COND_OVFNAN_NC | 3     | Overflow/NaN, no carry | n = p + p, n = p - n |
| COND_ZERO_C    | 4     | Zero, carry            | 0 = n + p, 0 = n - n |
| COND_NEG_C     | 5     | Negative, carry        | n = n + n, n = n - p |
| COND_POS_C     | 6     | Positive, carry        | p = n + p, p = n - n |
| COND_OVFNAN_C  | 7     | Overflow/NaN, carry    | p = n + n, p = n - p |

Each newly generated condition code is inserted as CV<sub>0</sub> at the low end of the four-element condition vector CV associated with the stream; the existing codes shift over and the old value of CV<sub>3</sub> is lost. If multiple operations in the same instruction generate condition codes (because there are multiple "\_TEST" suffixes), then the condition code from the C-operation is inserted first, followed by the condition code from the A-operation.
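The shifting behavior of the condition vector can be sketched in a few lines of Python. This is only an illustrative model of the rule stated above, not a description of the hardware; the names `push_cond` and `push_instruction` are hypothetical, and the vector is modeled as a list `[CV0, CV1, CV2, CV3]`.

```python
def push_cond(cv, code):
    """Insert `code` as CV0; the old CV0..CV2 shift up and the old CV3 is lost."""
    if not 0 <= code <= 7:
        raise ValueError("condition codes are three-bit values 0..7")
    return [code] + cv[:3]


def push_instruction(cv, c_code=None, a_code=None):
    """Apply the codes generated by one instruction.

    The C-operation's code is inserted first, then the A-operation's,
    so the A-operation's code ends up in CV0 when both are present.
    """
    for code in (c_code, a_code):
        if code is not None:
            cv = push_cond(cv, code)
    return cv
```

With both a C-operation and an A-operation generating codes, the A-operation's code lands in CV<sub>0</sub> and the C-operation's code in CV<sub>1</sub>.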

After integer arithmetic operations, the condition code describes the sign of the result in the obvious way unless overflow has occurred, in which case the result sign is negative if and only if there was no carry. Some integer and bit operations—such as INT\_MAX and BIT\_RIGHT\_ONES— generate the carry bit in a nonstandard way; for these operations overflow/NaN is not generated, and the condition code still accurately reflects the sign of the result.

After floating-point operations, the condition code describes the result in a way compatible with IEEE Standard 754. See §5 describing floating-point arithmetic.

A condition mask, shown as *cond* in the operation descriptions, describes a set of condition code values by summing the powers of two corresponding to the codes in the set, typically to determine whether a branch should take place. A *cond* can describe any combination of condition codes. For example, the condition mask named IF\_EQ (if equal) describes codes 0 and 4, so it has the value $2^0 + 2^4$, which is 0x11 or $11_{16}$.
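As a small worked example of the encoding just described, the following Python sketch builds a mask from a set of condition codes and tests a code against it. The helper names `cond_mask` and `mask_matches` are hypothetical and are not part of the instruction set.

```python
def cond_mask(codes):
    """Sum the powers of two for each condition code in the set."""
    mask = 0
    for code in codes:
        if not 0 <= code <= 7:
            raise ValueError("condition codes are three-bit values 0..7")
        mask |= 1 << code
    return mask


def mask_matches(mask, code):
    """True if condition code `code` belongs to the set described by `mask`."""
    return (mask >> code) & 1 == 1


IF_EQ = cond_mask([0, 4])  # zero, with or without carry
```

Here `IF_EQ` evaluates to 0x11, matching the value given in the text.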

Most of the important condition masks have one or more names. The named condition masks are shown below.


| Name                          | Value           | After (SUB_TEST x y z)              |
|-------------------------------|-----------------|-------------------------------------|
| CondMask: Manifest            |                 |                                     |
| IF_ALWAYS                     | 0 1 2 3 4 5 6 7 | always                              |
| IF_NEVER                      |                 | never                               |
| CondMask: Equality            |                 |                                     |
| IF_EQ                         | 0 4             | y = z (integer, unsigned, float)    |
| IF_ZE                         | 0 4             | x = 0 (integer, unsigned, float)    |
| IF_F                          | 0 4             | x = 0 (logical)                     |
| IF_NE                         | 1 2 3 5 6 7     | $y \neq z$ (integer, unsigned, float) |
| IF_NZ                         | 1 2 3 5 6 7     | $x \neq 0$ (integer, unsigned, float) |
| IF_T                          | 1 2 3 5 6 7     | $x \neq 0$ (logical)                |
| CondMask: Integer Comparison  |                 |                                     |
| IF_ILT                        | 1 5 7           | y < z (integer)                     |
| IF_IGE                        | 0 2 3 4 6       | $y \ge z$ (integer)                 |
| IF_IGT                        | 2 3 6           | y > z (integer)                     |
| IF_ILE                        | 0 1 4 5 7       | $y \le z$ (integer)                 |
| IF_IMI                        | 1 3 5           | x < 0 (integer)                     |
| IF_IPZ                        | 0 2 4 6 7       | $x \ge 0$ (integer)                 |
| IF_IPL                        | 2 6 7           | x > 0 (integer)                     |
| IF_IMZ                        | 0 1 3 4 5       | $x \le 0$ (integer)                 |
| CondMask: Unsigned Comparison |                 |                                     |
| IF_ULT                        | 1 2 3           | y < z (unsigned)                    |
| IF_UGE                        | 0 4 5 6 7       | $y \ge z$ (unsigned)                |
| IF_UGT                        | 5 6 7           | y > z (unsigned)                    |
| IF_ULE                        | 0 1 2 3 4       | $y \le z$ (unsigned)                |
| CondMask: Float Comparison    |                 |                                     |
| IF_FLT                        | 1 5             | y < z (float)                       |
| IF_FGE                        | 0 2 4 6         | $y \ge z$ (float)                   |
| IF_FGT                        | 2 6             | y > z (float)                       |
| IF_FLE                        | 0 1 4 5         | $y \le z$ (float)                   |
| CondMask: Other Tests         |                 |                                     |
| IF_IOV                        | 3 7             | x overflowed (integer)              |
| IF_FUN                        | 3 7             | y and z are unordered (float)       |
| IF_CY                         | 4 5 6 7         | carry                               |
| IF_NC                         | 0 1 2 3         | no carry                            |

| Name                          | Value | Meaning                |
|-------------------------------|-------|------------------------|
| CondMask: Specific Conditions |       |                        |
| IF_0                          | 0     | Zero, no carry         |
| IF_1                          | 1     | Negative, no carry     |
| IF_2                          | 2     | Positive, no carry     |
| IF_3                          | 3     | Overflow/NaN, no carry |
| IF_4                          | 4     | Zero, carry            |
| IF_5                          | 5     | Negative, carry        |
| IF_6                          | 6     | Positive, carry        |
| IF_7                          | 7     | Overflow/NaN, carry    |

### 4.1 Select Operations

The SELECT operations use three-bit encodings to specify one of eight of the most common condition masks. SELECT\_INT uses a mask IntSelect that encodes integer and unsigned comparisons as shown below. Additional selects can be realized by reversing the arguments u and v of the SELECT\_INT operation itself.

| Name      | Value | After (SUB_TEST x y z)           |
|-----------|-------|----------------------------------|
| IntSelect |       |                                  |
| SEL_CY    | 0     | carry                            |
| SEL_EQ    | 1     | y = z (integer, unsigned, float) |
| SEL_IGT   | 2     | y > z (integer)                  |
| SEL_IGE   | 3     | $y \ge z$ (integer)              |
| SEL_UGT   | 4     | y > z (unsigned)                 |
| SEL_UGE   | 5     | $y \ge z$ (unsigned)             |
| SEL_IPL   | 6     | x > 0 (integer)                  |
| SEL_IPZ   | 7     | $x \ge 0$ (integer)              |

The SELECT\_FLOAT operation uses the encoding FloatSelect as shown below.

| Name        | Value | After (FLOAT_MIN_TEST x y z)  |
|-------------|-------|-------------------------------|
| FloatSelect |       |                               |
| SEL_FLT     | 2     | y < z (float)                 |
| SEL_FLE     | 3     | $y \le z$ (float)             |
| SEL_FGT     | 4     | y > z (float)                 |
| SEL_FGE     | 5     | $y \ge z$ (float)             |
| SEL_FUN     | 6     | y and z are unordered (float) |

An IntSelect or FloatSelect enumeration describes the same condition code set as the identically suffixed CondMask.
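The correspondence between IntSelect values and condition masks can be written out as a lookup table. The sketch below is illustrative only: the code sets are copied from the identically suffixed CondMask entries in the tables above, and the names `INT_SELECT`, `cond_mask`, and `select_matches` are hypothetical.

```python
def cond_mask(codes):
    """Build a condition mask by summing 2**code over the set of codes."""
    mask = 0
    for code in codes:
        mask |= 1 << code
    return mask


# IntSelect value -> (name, condition mask of the same-suffixed CondMask)
INT_SELECT = {
    0: ("SEL_CY", cond_mask([4, 5, 6, 7])),
    1: ("SEL_EQ", cond_mask([0, 4])),
    2: ("SEL_IGT", cond_mask([2, 3, 6])),
    3: ("SEL_IGE", cond_mask([0, 2, 3, 4, 6])),
    4: ("SEL_UGT", cond_mask([5, 6, 7])),
    5: ("SEL_UGE", cond_mask([0, 4, 5, 6, 7])),
    6: ("SEL_IPL", cond_mask([2, 6, 7])),
    7: ("SEL_IPZ", cond_mask([0, 2, 4, 6, 7])),
}


def select_matches(sel, code):
    """True if condition code `code` satisfies the three-bit select `sel`."""
    _name, mask = INT_SELECT[sel]
    return (mask >> code) & 1 == 1
```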

## Chapter 5: Floating-point Arithmetic

### 5.1 Floating-point Formats

The IEEE Standard 754 floating-point double basic format (64 bit) and single basic format are supported. This is the structure of a normal 64-bit floating-point number:

| Bits    | Wd | Field Name | Type    | Description     |
|---------|----|------------|---------|-----------------|
| Float64 |    |            |         |                 |
| 63      | 1  | sign       | SignBit | sign bit        |
| 62-52   | 11 | exponent   | Uns     | biased exponent |
| 51-0    | 52 | fraction   | Uns     | fraction        |

This is the structure of a normal 32-bit floating-point number:

| Bits    | Wd | Field Name | Type    | Description |
|---------|----|------------|---------|-------------|
| Float32 |    |            |         |             |
| 31      | 1  | sign       | SignBit | sign bit    |
| 30-23   | 8  | exponent   | Uns     | exponent    |
| 22-0    | 23 | fraction   | Uns     | fraction    |

Doubled precision addition, subtraction, multiplication, and conversion operations to and from the single precision IEEE 754 format and both signed and unsigned integer formats are supported directly. Division and square root are accomplished with the help of iterative computation primitives that use a special floating-point format providing extra significand precision:

| Bits           | Wd | Field Name | Type    | Description     |
|----------------|----|------------|---------|-----------------|
| SpecialFloat64 |    |            |         |                 |
| 63             | 1  | sign       | SignBit | sign bit        |
| 62-53          | 10 | exponent   | Uns     | biased exponent |
| 52-0           | 53 | fraction   | Uns     | fraction        |

The SpecialFloat64 exponent is biased by 510, so that the true exponent is the biased exponent minus 510.
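The field layout and bias can be illustrated with a short Python sketch that splits a 64-bit word into the SpecialFloat64 fields. It only extracts fields and removes the bias; how the hardware interprets the 53-bit fraction is not specified here, so no numeric value is computed. The function name `decode_special_float64` is hypothetical.

```python
SPECIAL_EXPONENT_BIAS = 510


def decode_special_float64(word):
    """Split a 64-bit word into SpecialFloat64 fields (sign, exponent, fraction)."""
    if not 0 <= word < (1 << 64):
        raise ValueError("expected a 64-bit unsigned value")
    biased_exponent = (word >> 53) & 0x3FF       # Bits 62-53, 10 bits
    return {
        "sign": (word >> 63) & 0x1,              # Bit 63
        "biased_exponent": biased_exponent,
        "true_exponent": biased_exponent - SPECIAL_EXPONENT_BIAS,
        "fraction": word & ((1 << 53) - 1),      # Bits 52-0, 53 bits
    }
```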

All operations conform to the applicable IEEE standard.

Floating-point comparison operations set the condition code to indicate whether the operands are equal, greater, less, or NaN. The carry bit indicates the second operand (v or z) is a NaN. The float\_invalid exception is never raised by FLOAT\_MIN or FLOAT\_MAX. When the compare is performed by a FLOAT\_CMP\_TEST, float\_invalid is raised when the operands are unordered. Thus, IEEE 754 tests which do not raise an exception on unordered operands, such as a test for equality, should be implemented using FLOAT\_MIN\_TEST. Tests for inequalities such as greater than should use FLOAT\_CMP\_TEST to properly handle unordered operands.

Support for fast doubled precision arithmetic is provided. In doubled precision, a pair of 64-bit floating-point numbers is used to hold twice the significant digits and provide at least twice the precision of ordinary 64-bit floating point. There are provisions to compute the doubled precision sum, difference, and product efficiently. See the doubled precision programming examples in §12.4.

### 5.2 Rounding

Unless explicitly specified otherwise in an operation description, rounding is performed according to the rounding mode stored in field "round\_mode" in the ssw. The rounding modes are shown here.

| Name      | Value | Meaning                |
|-----------|-------|------------------------|
| RoundMode |       |                        |
| RND_NEAR  | 0     | round to nearest       |
| RND_CHOP  | 1     | round toward zero      |
| RND_FLOOR | 2     | round toward $-\infty$ |
| RND_CEIL  | 3     | round toward $+\infty$ |

Rounding is explicitly specified in some convert operations, such as FLOAT\_CEIL, INT\_CHOP, and UNS\_FLOOR.

### 5.3 Floating-point Exceptions

Floating-point exceptions are raised as a side effect of operation completion. The destination register of the operation is set in accordance with the IEEE Standard.

Besides the 64-bit result in the destination register, a floating-point exception records the destination register number and a four-bit floating-point result code in the result code register. A nonzero result code indicates that the destination register contains an exceptional value and summarizes that value. Floating-point result codes are described in §9.1. Note that a zero destination register will not allow the exceptional value to be saved.

In conformance with IEEE Standard 754, an invalid operation exception is raised (and a trap potentially taken) when a conditional test operation encounters a NaN when performing an inequality test as described in §5.1.

If overflow or underflow traps are disabled, then overflow delivers infinity or a maximum magnitude floating-point value, depending on the rounding mode, and underflow delivers a denormalized result. When floating-point overflow or underflow traps are enabled, the result in the destination register is the same as the masked response, so that the trap handler may report the program state and resume execution. Note that underflow is only raised when the result is inexact and subnormal, whether the underflow trap is enabled or not. The check for subnormal is before rounding, so the final result may actually be normalized (due to rounding). A float\_zero\_divide exception always returns a properly signed infinity. A float\_extension exception returns the argument to the operation (they are all unary) so that the trap handler may easily locate the value and complete the operation.
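The dependence of the overflow result on the rounding mode can be sketched as follows. The per-mode choices below are the default overflow results defined by IEEE Standard 754, which this chapter says the operations conform to; they are not taken from a Tera-specific table. The function name `masked_overflow_result` is hypothetical, and Python floats stand in for the 64-bit format.

```python
import math
import sys

RND_NEAR, RND_CHOP, RND_FLOOR, RND_CEIL = 0, 1, 2, 3


def masked_overflow_result(negative, round_mode):
    """Result delivered on overflow for a result of the given sign."""
    if round_mode == RND_NEAR:
        to_infinity = True            # nearest: always infinity
    elif round_mode == RND_CHOP:
        to_infinity = False           # toward zero: largest finite magnitude
    elif round_mode == RND_FLOOR:
        to_infinity = negative        # toward -inf: infinity only if negative
    elif round_mode == RND_CEIL:
        to_infinity = not negative    # toward +inf: infinity only if positive
    else:
        raise ValueError("unknown rounding mode")
    magnitude = math.inf if to_infinity else sys.float_info.max
    return -magnitude if negative else magnitude
```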

A float\_invalid exception generates a NaN value in accordance with the IEEE standard. A NaN generated by an operation describes the reason for the exception using the enumeration below. The appropriate code is stored in the low three bits of the fraction.

| Name              | Value | Meaning                             |
|-------------------|-------|-------------------------------------|
| NaNResultCode     |       |                                     |
| NAN_ZERO_MUL_INF  | 1     | Zero times infinity                 |
| NAN_INF_SUB_INF   | 2     | Magnitude subtraction of infinities |
| NAN_ZERO_DIV_ZERO | 4     | Zero divided by zero                |
| NAN_INF_DIV_INF   | 5     | Infinity divided by infinity        |
| NAN_SQRT_NEG      | 6     | Square root of negative number      |

The high-order fraction bits of a NaN are zero; these bits are Bits 51-3 for a normal 64-bit floating-point number, and Bits 22-3 for a 32-bit floating-point number. The destination register (t or x) is stored in the result code register, so that the NaN may be examined by the trap handler for diagnosis or continuation.
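A generated 64-bit NaN can be modeled as a bit pattern with an all-ones exponent, zero in fraction Bits 51-3, and the result code in Bits 2-0. The Python sketch below builds and decodes such a pattern. The sign bit of a generated NaN is not stated in this section, so the `sign` parameter (default 0) is an assumption; the helper names are hypothetical.

```python
import math
import struct

NAN_RESULT_CODES = {
    1: "NAN_ZERO_MUL_INF",
    2: "NAN_INF_SUB_INF",
    4: "NAN_ZERO_DIV_ZERO",
    5: "NAN_INF_DIV_INF",
    6: "NAN_SQRT_NEG",
}


def make_nan64(code, sign=0):
    """Bit pattern of a 64-bit NaN carrying `code` (sign bit is an assumption)."""
    if code not in NAN_RESULT_CODES:
        raise ValueError("not a defined NaNResultCode")
    return (sign << 63) | (0x7FF << 52) | code


def nan64_reason(bits):
    """Name of the result code held in the low three fraction bits, or None."""
    return NAN_RESULT_CODES.get(bits & 0x7)


def bits_to_float(bits):
    """Reinterpret a 64-bit pattern as an IEEE 754 double."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]
```

For example, `make_nan64(4)` yields a pattern that Python also recognizes as a NaN, and `nan64_reason` recovers the name NAN_ZERO_DIV_ZERO from it.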

There are no signaling NaNs, but data trap bits provide a more comprehensive mechanism; see §6.1.

## Chapter 6: Data Memory

A Tera system has either two or four data memory units per processor. When four units per processor are configured, the additional two units are referred to as "expanded data memory". Data is accessed by LOAD, STORE, FETCH\_ADD, and STATE operations. This chapter describes what is stored in data memory, the semantics of accessing it, the address translation mechanism, and finally the internal state of the M functional unit. The M-unit can simultaneously process up to eight pending requests for data memory access by each stream.

### 6.1 Data Memory Access

Every data memory cell contains a 64-bit value and a four-bit access state. The value in a memory cell can be addressed as a word, 2 halfwords, 4 quarterwords, or 8 bytes. The order of bytes in quarterwords, quarterwords in halfwords, and halfwords in words is "big-endian", i.e. packed so that addresses increase as significance decreases. Thus the word at address A, read from left to right, most significant bit to least, contains bytes with addresses A, A + 1, ... A + 7; quarterwords with addresses A, A + 2, A + 4, and A + 6; and halfwords with addresses A and A + 4.
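The big-endian packing can be illustrated by extracting a partial word from a 64-bit cell value. In this Python sketch, `byte_offset` is the low three bits of the address (0 for address A, 7 for A + 7) and `size` is the object size in bytes; the function name `extract` is hypothetical, and the sketch requires aligned offsets for simplicity.

```python
def extract(word, byte_offset, size):
    """Return the `size`-byte object at `byte_offset` within a 64-bit word.

    Lower addresses hold more significant bits (big-endian packing).
    """
    if size not in (1, 2, 4, 8):
        raise ValueError("size must be 1 (byte), 2 (quarterword), 4 (halfword), or 8 (word)")
    if not 0 <= byte_offset < 8 or byte_offset % size != 0:
        raise ValueError("offset must lie in the word and be a multiple of size")
    shift = (8 - byte_offset - size) * 8      # bits to the right of the object
    return (word >> shift) & ((1 << (size * 8)) - 1)
```

For the word 0x0011223344556677, the byte at A + 7 is 0x77, the quarterword at A + 2 is 0x2233, and the halfword at A + 4 is 0x44556677.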

The access state modifies the behavior of memory references to the word or partial word contained in the cell. It has this structure:

| Bits        | Wd | Field Name     | Type    | Description        |
|-------------|----|----------------|---------|--------------------|
| AccessState |    |                |         |                    |
| 3           | 1  | full           | Boolean | full/empty bit     |
| 2           | 1  | forward_enable | Boolean | forward enable     |
| 1           | 1  | trap1_enable   | Boolean | data trap 1 enable |
| 0           | 1  | trap0_enable   | Boolean | data trap 0 enable |

The operations STATE\_LOAD, STATE\_STORE, and STATE\_LOCK are respectively used to load, store, and lock the access state.

Operations always access data memory relative to a pointer. The semantics of memory access are determined by an access control field in this pointer, possibly overridden by an access control field in the operation, and by the access state of the addressed memory cell(s). Briefly, the access can be forced to wait until the cell is either empty or full, a data blocked exception can be raised in response to load or store accesses to the cell, and a memory cell can forward accesses to another memory cell.

A pointer has two parts: an access control part, which modifies access through the pointer, and an address part. The fields in a pointer are as follows:

| Bits                    | Wd | Field Name          | Type             | Description                  |
|-------------------------|----|---------------------|------------------|------------------------------|
| Pointer: access control |    |                     |                  |                              |
| 63                      | 1  | 0                   |                  | reserved                     |
| 62                      | 1  | fwd_disable         | Boolean          | forwarding disable           |
| 61-60                   | 2  | fe_control          | FullEmptyControl | full/empty control           |
| 59                      | 1  | trap1_store_disable | Boolean          | data trap 1 disable on store |
| 58                      | 1  | trap1_load_disable  | Boolean          | data trap 1 disable on load  |
| 57                      | 1  | trap0_store_disable | Boolean          | data trap 0 disable on store |
| 56                      | 1  | trap0_load_disable  | Boolean          | data trap 0 disable on load  |
| 55-48                   | 8  | 0                   |                  | reserved                     |
| Pointer: address        |    |                     |                  |                              |
| 47-0                    | 48 | address             | DataAddrUns      | data memory address          |

The field "address", shown here as a type DataAddrUns, is actually a structure of type DataAddress, used in data memory address translation; see §6.2.

The value in the field "fe\_control" is of type FullEmptyControl, described here. In the description of the load and store behaviors, the term "waits for empty (full)" means that the operation waits until the field "full" in the memory cell's access state becomes false (true); the term "sets empty (full)" means that the field "full" is set to false (true).

| Name             | Value | Behavior                                                                                         |
|------------------|-------|--------------------------------------------------------------------------------------------------|
| FullEmptyControl |       |                                                                                                  |
| FE_NORMAL        | 0     | LOAD loads; STORE stores and sets full                                                           |
| FE_FUTURE        | 2     | LOAD waits for full, then loads; STORE waits for full, then stores                               |
| FE_SYNC          | 3     | LOAD waits for full, then loads and sets empty; STORE waits for empty, then stores and sets full |

Some memory operations encode an access control operand, abbreviated ac, that supersedes the pointer's access control specification. The operation access control structure is shown here:

| Bits                   | Wd | Field Name    | Type             | Description         |
|------------------------|----|---------------|------------------|---------------------|
| OperationAccessControl |    |               |                  |                     |
| 4                      | 1  | fwd_disable   | Boolean          | forwarding disable  |
| 3-2                    | 2  | fe_control    | FullEmptyControl | full/empty control  |
| 1                      | 1  | trap1_disable | Boolean          | data trap 1 disable |
| 0                      | 1  | trap0_disable | Boolean          | data trap 0 disable |

Memory reference operations first add the address field from a pointer held in register s to an optional scaled offset. Addition is done modulo $2^{48}$. The offset is derived from either another register y or from an unsigned literal displacement *disp* in the instruction and is then scaled (multiplied) by the size in bytes of the addressed object. The length of the *disp* field varies so that the scaled offset covers the same set of memory locations independent of object size. This sum is the effective address of a word, halfword, quarterword, or byte in memory.

Then, the effective address is checked against the map limit for this domain. If the limit is exceeded, a data map limit exception is raised.

Unless the field "unaligned\_data\_enable" is set in the ssw, a data alignment exception will be raised when an effective address presented to memory is not a multiple of the number of bytes in the addressed object.
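The effective address calculation and the alignment rule can be sketched in Python as follows. This is a simplified model under stated assumptions: the offset is passed as a plain integer (how a register offset is interpreted is not modeled), and the names `effective_address` and `is_aligned` are hypothetical.

```python
ADDRESS_MASK = (1 << 48) - 1   # Bits 47-0 of a pointer hold the address


def effective_address(pointer, offset=0, size=8):
    """Add the scaled offset to the pointer's address field, modulo 2**48."""
    if size not in (1, 2, 4, 8):
        raise ValueError("size must be 1, 2, 4, or 8 bytes")
    base = pointer & ADDRESS_MASK          # drop the access control bits
    return (base + offset * size) & ADDRESS_MASK


def is_aligned(address, size):
    """True if the address is a multiple of the object size in bytes."""
    return address % size == 0
```

A word access through a pointer to 0x1000 with offset 3 yields 0x1018, and an address near the top of the 48-bit range wraps around to zero rather than carrying into the access control bits.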

At this point, the data map entry is consulted. If the current privilege level of this stream is insufficient for the map's protection level for this type of operation, a data protection level exception is raised. Otherwise, the segment offset in the effective address is checked against the segment limit in the map entry. If the limit is exceeded, a data segment limit exception is raised.

Next, a data blocked exception is raised if a data trap bit is enabled in the addressed word and the corresponding data trap disable bit is clear in ac or in s when ac is not present.

Forwarding is examined and handled next. If ac is present in this operation, its forward disable bit is used; otherwise that of s is used. If the selected forwarding disable bit is clear and the forward bit is enabled in the addressed word, then the cell may contain a forwarding pointer rather than the data itself. If the cell is forwarded and empty then the operation is retried later; the interpretation is that the forwarding pointer is locked. If the cell is full then the value in the cell is used as an effective word address for another memory access. No registers are modified in this process. The relative word position of a partial word access is unchanged; the three least significant bits of the forwarded effective address are copied from the original address. Data traps at forwarded locations are processed as usual; the data trap disable bits in effect are the original data trap disable bits. The forwarding disable bit in effect is the original (clear) forwarding disable bit. The full/empty control bits are taken without modification from ac or from s when ac is not present.

Finally, synchronization is handled. The full bit in the addressed word is processed in conjunction with the full/empty control bits from ac, or s when ac is not present.

No memory full bit testing occurs if full/empty control is FE\_NORMAL; in this case load operations fetch the value of the addressed word or partial word into register r, and store operations store the contents of r into the addressed word or partial word and set it full.

If full/empty control is FE\_FUTURE or FE\_SYNC, then the memory full bit is tested. If its state is the one waited for, then the load or store of the value occurs, and the memory full bit is changed if full/empty control was FE\_SYNC. Otherwise, the operation is retried later.
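The three full/empty behaviors can be modeled with a small Python sketch. It covers only the synchronization step described above (no forwarding, traps, or retry limit), and the `Cell` class, the `RETRY` sentinel, and the function names are hypothetical illustrations.

```python
FE_NORMAL, FE_FUTURE, FE_SYNC = 0, 2, 3
RETRY = object()   # hypothetical marker: the operation must be retried later


class Cell:
    """A memory cell reduced to its value and its full bit."""

    def __init__(self, value=0, full=False):
        self.value = value
        self.full = full


def load(cell, fe_control):
    """Return the loaded value, or RETRY if the cell is not in the awaited state."""
    if fe_control not in (FE_NORMAL, FE_FUTURE, FE_SYNC):
        raise ValueError("undefined full/empty control value")
    if fe_control == FE_NORMAL:
        return cell.value                  # no full bit testing
    if not cell.full:
        return RETRY                       # FE_FUTURE and FE_SYNC wait for full
    if fe_control == FE_SYNC:
        cell.full = False                  # synchronized load sets empty
    return cell.value


def store(cell, fe_control, value):
    """Return True if the store happened, or RETRY if it must wait."""
    if fe_control not in (FE_NORMAL, FE_FUTURE, FE_SYNC):
        raise ValueError("undefined full/empty control value")
    if fe_control == FE_FUTURE and not cell.full:
        return RETRY                       # FE_FUTURE store waits for full
    if fe_control == FE_SYNC and cell.full:
        return RETRY                       # FE_SYNC store waits for empty
    cell.value = value
    if fe_control != FE_FUTURE:
        cell.full = True                   # FE_NORMAL and FE_SYNC set full
    return True
```

A synchronized store followed by a synchronized load behaves like a one-element producer/consumer buffer: the second store must wait until the load has emptied the cell.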

When the operation is retried it starts over with the original address (before any forwarding). If the total number of memory cell accesses due to forwarding and retrying exceeds the retry limit in the data state descriptor, a data blocked exception is raised and the operation is aborted. Retries may also be caused by network contention, translation stalls, and other miscellaneous hardware events. When ssw\_override mode is set, all memory operations except synchronizing loads, stores, and int\_fetch\_adds will retry forever and will not raise the data blocked exception.

### 6.2 Data Memory Address Translation

Data memory addresses are found in Bits 47-0 of pointers. A data memory address has this structure:

| Bits        | Wd | Field Name          | Type          | Description         |
|-------------|----|---------------------|---------------|---------------------|
| DataAddress |    |                     |               |                     |
| 47-28       | 20 | data_segment_number | DataSegment   | data segment number |
| 27-13       | 15 | data_segment_offset | SegmentOffset | data segment offset |
| 12-3        | 10 | data_frame_offset   | Uns           | data frame offset   |
| 2-0         | 3  | byte_offset         | Uns           | byte offset         |

A complete data memory address is 48 bits long, potentially addressing 256 Terabytes of memory. However, only 42 bits of the address are currently implemented, and the high-order six bits of the data segment number must be zero. The implemented data address space is consequently 4 Terabytes. This space is partitioned into 16K segments, each of which can vary in size from 8 Kbyte to 256 Mbyte in 8 Kbyte increments.
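The field boundaries and the sizes quoted above can be checked with a short Python sketch. The function and constant names are hypothetical; the bit positions come from the DataAddress structure.

```python
def decode_data_address(address):
    """Split a 48-bit data memory address into its DataAddress fields."""
    if not 0 <= address < (1 << 48):
        raise ValueError("data memory addresses are 48 bits")
    return {
        "data_segment_number": (address >> 28) & 0xFFFFF,  # Bits 47-28
        "data_segment_offset": (address >> 13) & 0x7FFF,   # Bits 27-13
        "data_frame_offset": (address >> 3) & 0x3FF,       # Bits 12-3
        "byte_offset": address & 0x7,                      # Bits 2-0
    }


def is_implemented(address):
    """True if the high-order six bits of the segment number are zero."""
    return (address >> 42) == 0


IMPLEMENTED_BYTES = 1 << 42    # 42 implemented address bits: 4 Terabytes
SEGMENT_COUNT = 1 << 14        # 20 - 6 segment number bits: 16K segments
SEGMENT_GRANULE = 1 << 13      # frame offset + byte offset bits: 8 Kbytes
MAX_SEGMENT_BYTES = 1 << 28    # 15 offset bits times 8 Kbytes: 256 Mbytes
```

The constants are consistent with one another: 16K segments of at most 256 Mbytes each cover exactly the 4 Terabyte implemented space.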

Data memory address translation proceeds in five logical steps. The translation logic block diagram is shown in Figure 6.1. First, the data segment number is validated. Second, the data segment map is accessed, yielding a data segment map entry. Third, the protection level is checked and the segment address is limited and relocated using values in the map entry. This yields a logical address in two parts, a logical unit number and logical unit offset. Fourth, the data frame offset is transformed so that memory references are scrambled, yielding the logical frame offset. Fifth, the logical address is distributed to spread references among the logical units (the memory resources). These steps are now described in more detail.

The first logical step is to validate the data segment number and select the data map to use for translation. If the protection domain's data map limit from the protection domain's data state descriptor is less than the data segment number, a data map limit exception is raised. Otherwise, the resulting segment number and domain number are sent to the data segment map. At this point, the alignment requirements for the selected operation are checked against the effective address. If the reference is unaligned and field "unaligned\_data\_enable" of the ssw is clear, the data alignment exception is raised.

The second step in the translation reads a data map entry from the data segment map. Each entry has the structure shown here:




FIGURE 6.1: Data Mapping Logic Block Diagram

| Bits         | Wd | Field Name          | Type          | Description                                            |
|--------------|----|---------------------|---------------|--------------------------------------------------------|
| DataMapEntry |    |                     |               |                                                        |
| 63-62        | 2  | store_level         | Level         | minimum store protection level                         |
| 61-60        | 2  | load_level          | Level         | minimum load protection level                          |
| 59-57        | 3  | 0                   |               | reserved                                               |
| 56           | 1  | stall               | Boolean       | stall references to this entry                         |
| 55           | 1  | locked              | Boolean       | lock this map entry into TLB                           |
| 54           | 1  | distribution_enable | Boolean       | distribution enable                                    |
| 53-52        | 2  | memory_type         | Resource      | select data memory, expanded data memory, or IOP units |
| 51-48        | 4  | 0                   |               | reserved                                               |
| 47-40        | 8  | logical_unit        | Uns           | logical unit number                                    |
| 39           | 1  | 0                   |               | reserved                                               |
| 38-24        | 15 | segment_limit       | SegmentOffset | segment limit                                          |
| 23-19        | 5  | 0                   |               | reserved                                               |
| 18-0         | 19 | segment_base        | DataFrame     | segment base                                           |

This map is stored in local data memory, starting at the data map base for the given domain (see §8.4). To speed translation, a translation lookaside buffer (TLB) caches the map entries. Coherency is maintained by flushing modified entries from the cache using the DATA\_MAP\_FLUSH operation. The desired entries to flush are specified with a domain data address, which combines the domain and data address to index the data map. Entries with field "locked" set will only be evicted from the cache by a DATA\_MAP\_FLUSH operation which matches the entry. The implemented data map cache contains 512 entries, with four-way associativity. Within each set, entries are replaced using a least-recently-used policy. To reduce contention between domains, the domain number times eight is exclusive-or'ed with the set index before addressing the TLB. The DATA\_MAP\_FLUSH\_ANY operation treats the TLB as direct mapped, using the low two bits of the tag as the set index, using a domain data TLB address. To flush all entries for a domain from the cache, each entry must be accessed. The flush addresses must sequence through all possible values of the set\_index and set\_number. The domain data address and domain data TLB address structures are shown below:

| Bits              | Wd | Field Name     | Type | Description                                    |
|-------------------|----|----------------|------|------------------------------------------------|
| DomainDataAddress |    |                |      |                                                |
| 63-60             | 4  | domain         | Uns  | the domain to which this data address pertains |
| 59-42             | 18 | 0              |      | reserved                                       |
| 41-35             | 7  | tag            | Uns  | data TLB tag                                   |
| 34-28             | 7  | set_number     | Uns  | data TLB set number                            |
| 27-0              | 28 | segment_offset | Uns  | untranslated bits                              |

Domain Data Address

| Bits  | Wd | Field Name     | Type | Description                                    |
|-------|----|----------------|------|------------------------------------------------|
| DomainDataTLBAddress |  |     |      |                                                |
| 63-60 | 4  | domain         | Uns  | the domain to which this data address pertains |
| 59-42 | 18 | 00000          |      | reserved                                       |
| 41-37 | 5  | tag            | Uns  | data TLB tag                                   |
| 36-35 | 2  | set_index      | Uns  | data TLB set index                             |
| 34-28 | 7  | set_number     | Uns  | data TLB set number                            |
| 27-0  | 28 | segment_offset | Uns  | untranslated bits                              |
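The domain data TLB address layout can be packed with ordinary shifts and masks. The following is a minimal Python sketch, not part of the architecture definition: the helper names `pack_tlb_address` and `flush_any_addresses` are hypothetical, the tag and segment offset are assumed to be irrelevant for a full flush and are left at zero, and issuing the DATA\_MAP\_FLUSH\_ANY operation itself is not modeled.

```python
# Field positions taken from the DomainDataTLBAddress table.
DOMAIN_SHIFT = 60      # bits 63-60
TAG_SHIFT = 37         # bits 41-37
SET_INDEX_SHIFT = 35   # bits 36-35
SET_NUMBER_SHIFT = 28  # bits 34-28


def pack_tlb_address(domain, tag, set_index, set_number, segment_offset=0):
    """Pack a domain data TLB address (hypothetical helper)."""
    assert 0 <= domain < 16 and 0 <= tag < 32
    assert 0 <= set_index < 4 and 0 <= set_number < 128
    assert 0 <= segment_offset < (1 << 28)
    return ((domain << DOMAIN_SHIFT) | (tag << TAG_SHIFT)
            | (set_index << SET_INDEX_SHIFT)
            | (set_number << SET_NUMBER_SHIFT) | segment_offset)


def flush_any_addresses(domain):
    """Yield one address per (set_number, set_index) pair for a domain."""
    for set_number in range(128):
        for set_index in range(4):
            yield pack_tlb_address(domain, 0, set_index, set_number)


addresses = list(flush_any_addresses(domain=3))
```

The 128 set numbers and four set indices give 512 addresses, one for each entry of the implemented data map cache.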

The third translation step limits and relocates the segmented address. If the current privilege level of the stream storing (loading) data in this segment is less than the minimum store (load) protection level in field "store\_level" (field "load\_level") of the data map entry, then a data protection level exception is raised. Note that many load operations will need store privilege to properly update the access state. However, store operations are implicitly given load privilege to properly follow access control waiting or trapping. That is, the store protection level is assumed to be no higher than the load protection level.

The segment limit, field "segment\_limit", is compared with Bits 27-13 of the data address. If the segment limit is smaller, then a data segment limit exception is raised.

If the field "stall" is set, then the operation is returned to the retry pool and tried again later. This forced retry allows the supervisor to perform some memory management operations without stopping all activity in the domain. Note that the PROBE operation is not affected by field "stall", as its result is determined by the earlier segment limit check.

Otherwise, Bits 27-13 of the data address, extended with zeros on the left to 19 bits, are added to the segment base field "segment\_base" in the data map entry. The sum is sent to the address scrambler as the logical segment offset.
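The limit check and relocation of the third step can be sketched as follows. This is an illustration only: the function and exception names are hypothetical, and the protection level check and the "stall" retry are deliberately left out.

```python
class DataSegmentLimitError(Exception):
    """Stands in for the data segment limit exception."""


def relocate(data_address, segment_limit, segment_base):
    """Limit-check bits 27-13 of the address and add the segment base."""
    frame = (data_address >> 13) & 0x7FFF   # bits 27-13, 15 bits wide
    if segment_limit < frame:
        raise DataSegmentLimitError(hex(data_address))
    # Zero extension to 19 bits is implicit for a non-negative Python int.
    return segment_base + frame
```

For example, `relocate(5 << 13, 7, 100)` passes the limit check and returns 105, while an address whose bits 27-13 hold 8 fails against the same limit of 7.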

The fourth step scrambles Bits 21-3 of the data address, producing the logical frame offset. The concatenation of the logical segment offset and the data frame offset is treated as a 29-element vector in $GF(2)^{29}$. ($GF(2)$ is the field with elements 0 and 1, with logical AND as its multiply operation and exclusive-or as its addition operation; $GF(2)^{29}$ is the vector space of dimension 29 over this field.) The vector is scrambled by multiplying it by the unit scrambling matrix, a fixed 29-by-19 bit matrix whose low-order 19-by-19 bit submatrix is invertible. This multiply yields the logical frame offset.

The scrambling matrix is chosen to make any sequence of constant-stride addresses spanning a length  $s < 2^{29}$  generate a nearly uniform distribution in the logical frame offset, which in turn generates a nearly uniform distribution in the physical unit number field and the low-order bank bits of the unit offset. Appendix C.1 specifies the matrix and the inverse of the low-order 19-by-19 submatrix.
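A multiply over GF(2) reduces to an AND followed by a parity (exclusive-or) reduction for each output bit. The sketch below shows the mechanics only. `TOY_COLUMNS` is a made-up placeholder and is NOT the matrix of Appendix C.1; the representation of the matrix as one 29-bit mask per output bit is likewise an assumption chosen for brevity.

```python
def gf2_multiply(vector, columns):
    """Multiply a bit vector by a bit matrix over GF(2).

    `columns[j]` is a mask selecting the input bits that feed output bit j.
    """
    result = 0
    for j, mask in enumerate(columns):
        selected = vector & mask                 # GF(2) multiply is AND
        parity = bin(selected).count("1") & 1    # GF(2) add is XOR
        result |= parity << j
    return result


# Placeholder matrix (an assumption, not the real one): output bit j is
# input bit j XOR input bit j + 10, for j in 0..18.
TOY_COLUMNS = [(1 << j) | (1 << (j + 10)) for j in range(19)]
```

Because the operation is linear, scrambling the exclusive-or of two vectors gives the exclusive-or of their scrambled values, whatever matrix is used.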

The distributor takes the concatenation of the logical unit number (field "logical unit" from the data map entry), the logical segment offset, and the logical frame offset as its logical address. The next step distributes this logical address to control physical locality of reference. The distributor transforms the logical address into a physical address consisting of an eight-bit physical unit number and a 29-bit physical unit offset.

The distribution bit (field "distribution\_enable") in the data map entry allows references to be spread over all p memories in the system, rather than staying within one memory unit. Here, p is set via the scan system and usually matches the number of processors, making p a power of two. However, in the presence of a faulty memory, p may also be a power of two less one. Distribution divides the logical unit address by p (or p+1 with a faulty memory) to effectively right shift the low-order bits of the logical unit number into the high-order bit positions of the logical unit offset and replace them with the low-order bits from the logical frame offset. Thus, the remainder becomes the new unit number and the quotient the unit offset. With a faulty memory, hardware mapping will allow distribution to bypass a faulty resource. This mapping may be set differently for each resource class, so that a system can run with any one faulty normal data memory resource and any one faulty expanded data memory resource. Even when distribution is disabled, this mapping will be in effect, so that the logical unit space appears contiguous.

This scheme allows distribution across physical memory units under control of the data map entry. A data map entry with distribution enabled will address all usable physical memory units in the system, implying that only  $2^{29}/p$  words are addressed in each unit as (word) addresses increase from 0 to  $2^{29} - 1$ . Moreover, the logical unit number u appearing in a data map entry is required to be less than p when distribution is enabled. If u is too large, a data address unimplemented exception is raised.
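The divide-by-p step can be sketched in a few lines. This is a simplified illustration for the fault-free case: it assumes the logical address is the logical unit number concatenated with a 29-bit word offset, the name `distribute` is hypothetical, and the faulty-memory remapping is not modeled.

```python
OFFSET_BITS = 29  # width of the unit offset, in words


def distribute(logical_unit, logical_offset, p):
    """Spread a logical address over p memory units (sketch)."""
    if logical_unit >= p:
        raise ValueError("data address unimplemented: unit number >= p")
    logical_address = (logical_unit << OFFSET_BITS) | logical_offset
    # The quotient is the unit offset and the remainder the unit number.
    unit_offset, unit_number = divmod(logical_address, p)
    return unit_number, unit_offset
```

With p = 4, consecutive word offsets 0, 1, 2, 3 of logical unit 0 land on physical units 0, 1, 2, 3, which is the spreading effect the paragraph describes.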

The low-order three bits of the original address are the byte address of the datum within the addressed word and are copied without modification into the low-order three bits of the final unit offset.

The Tera MTA computer supports machine subsetting. This feature allows any power of two subset of a machine to appear to software as an independent machine. For example, a 16-processor machine could be split into two eight-processor machines. In such a case, the interconnection network need not be split, but can be shared. To support subsetting, a physical unit base register is set up to convert unit numbers from the 0 to p-1 range to the appropriate range in the actual machine.

The resulting address is then sent to the network for routing to a memory unit. The physical unit number and the field "memory\_type" are used to construct the network address, which controls network routing. The memory type should be selected from the values in the following enumeration:

| Value | Behavior             |
|-------|----------------------|
| Resource |                   |
| 0     | Expanded data memory |
| 1     | Normal data memory   |
| 2     | I/O processor        |

The hardware supports an option which combines the normal and expanded data memory for global distribution. When that option is enabled, RES\_DMEM selects the bottom 1 gigabyte per processor of the global memory pool and RES\_IOM selects the top 1 gigabyte.

If the physical address is unimplemented, then a data protection exception is raised when the issuing operation completes. If the memory system detects an uncorrectable error such as a double-bit ECC error on the data loaded from memory, then a data hardware error exception is raised when the issuing operation completes. To help detect hot spots, successful loads which take an excessive amount of time to travel from the processor to the memory will raise a latency limit exception while performing the load. Synchronizing loads which retry are not subject to the limit until they succeed. This limit is set during system initialization. This implementation checks the limit with a granularity of 16 cycles. The compares are performed modulo 4096 cycles, so that a latency of 4112 would appear as a latency of 16.
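The wraparound can be checked with a short calculation. The sketch below assumes that the 16-cycle granularity rounds the wrapped latency down to a multiple of 16; the text does not state the rounding direction, and the function name is hypothetical.

```python
def observed_latency(actual_cycles):
    """Latency as the checker sees it: modulo 4096, in 16-cycle steps."""
    wrapped = actual_cycles % 4096
    return wrapped - (wrapped % 16)   # assumed: round down to a multiple of 16
```

Under this model `observed_latency(4112)` is 16, matching the example in the text.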

## 6.3 M-unit Internal State

The M-unit processes memory requests that are generated by M-operations. The M-unit may have up to eight requests simultaneously pending for each stream. The M-unit completes each request asynchronously.

For each stream, the state of any failed M-operations is held in eight pairs of registers called the data control registers and data value registers. A trap handler can save these register pairs using the DATA\_OPA\_SAVE and DATA\_OPD\_SAVE operations and can later use them to retry the operation with DATA\_OP\_REDO. The values in these registers are now described in more detail.

The eight data control registers contain address and control information for up to eight memory reference operations in progress in the M-unit, due to lookahead. The M-unit writes one of these registers when a memory operation is initiated, and reads it as it (re)tries the reference. When no operations are in progress, the program may read them directly using the DATA\_OPA\_SAVE operation. Each of the data control registers contains a data control descriptor with the structure shown here.

| Bits  | Wd | Field Name    | Type             | Description                    |
|-------|----|---------------|------------------|--------------------------------|
| DataControlDescriptor |  |           |                  |                                |
| 63    | 1  | 0             |                  | reserved                       |
| 62    | 1  | fwd_disable   | Boolean          | forwarding disable             |
| 61-60 | 2  | fe_control    | FullEmptyControl | full/empty control             |
| 59    | 1  | trap1_disable | Boolean          | data trap 1 disable            |
| 58    | 1  | trap0_disable | Boolean          | data trap 0 disable            |
| 57-53 | 5  | dest_reg      | Reg              | destination or source register |
| 52-48 | 5  | restop        | RetryOpCode      | rest of the operation code     |
| 47-0  | 48 | address       | DataAddrUns      | byte address                   |

The value in the field "restop" encodes the operation that failed and raised an exception. The high-order bit of the field "restop" is set if the operation was a load (more precisely, an operation that writes a register upon completion) and is cleared if the operation was a store. The RetryOpCode enumeration is shown here.

| Name                        | Value | Meaning            |
|-----------------------------|-------|--------------------|
| RetryOpCode: Stores         |       |                    |
| OPA_STOREB                  | 0     | STOREB             |
| OPA_STOREQ                  | 1     | STOREQ             |
| OPA_STOREH                  | 2     | STOREH             |
| OPA_STORE                   | 3     | STORE              |
| OPA_STATE_STORE             | 7     | STATE_STORE        |
| OPA_STORE_EMPTY             | 11    | REG_STORE          |
| RetryOpCode: Loads          |       |                    |
| OPA_INT_LOADB               | 16    | INT_LOADB          |
| OPA_INT_LOADQ               | 17    | INT_LOADQ          |
| OPA_INT_LOADH               | 18    | INT_LOADH          |
| OPA_INT_FETCH_ADD           | 19    | INT_FETCH_ADD      |
| OPA_UNS_LOADB               | 20    | UNS_LOADB          |
| OPA_UNS_LOADQ               | 21    | UNS_LOADQ          |
| OPA_UNS_LOADH               | 22    | UNS_LOADH          |
| OPA_LOAD                    | 23    | LOAD               |
| OPA_STATE_LOAD              | 24    | STATE_LOAD         |
| OPA_STATE_LOCK              | 25    | STATE_LOCK         |
| OPA_PROBE                   | 26    | PROBE              |
| OPA_REG_LOAD                | 27    | REG_LOAD           |
| OPA_SCRUB_LOAD              | 29    | SCRUB_LOAD         |
| RetryOpCode: Internal Codes |       |                    |
| OPA_STREAM_CREATE           | 4     | STREAM_CREATE      |
| OPA_MAP_FLUSH               | 5     | DATA_MAP_FLUSH     |
| OPA_MAP_FLUSH_ANY           | 6     | DATA_MAP_FLUSH_ANY |
| OPA_DATA_STATE_RESTORE      | 9     | DATA_STATE_RESTORE |
| OPA_STREAM_CATCH            | 12    | STREAM_CATCH       |
| OPA_DATA_OPD_SAVE           | 14    | DATA_OPD_SAVE      |
| OPA_DATA_OPA_SAVE           | 15    | DATA_OPA_SAVE      |
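A trap handler that has saved a data control register needs to pick the descriptor apart. The following Python sketch decodes the fields by the bit positions given in the data control descriptor table; the names `DataControl`, `decode_data_control`, and `writes_register` are illustrative and do not belong to any TERA software interface.

```python
from collections import namedtuple

DataControl = namedtuple(
    "DataControl",
    "fwd_disable fe_control trap1_disable trap0_disable dest_reg restop address",
)


def decode_data_control(word):
    """Split a 64-bit data control descriptor into its fields."""
    return DataControl(
        fwd_disable=bool((word >> 62) & 1),
        fe_control=(word >> 60) & 0x3,
        trap1_disable=bool((word >> 59) & 1),
        trap0_disable=bool((word >> 58) & 1),
        dest_reg=(word >> 53) & 0x1F,
        restop=(word >> 48) & 0x1F,
        address=word & ((1 << 48) - 1),
    )


def writes_register(restop):
    """True when the high-order bit of the five-bit restop field is set."""
    return bool(restop & 0x10)
```

For instance, a descriptor with "restop" equal to 23 (OPA\_LOAD) reports `writes_register` as true, while 3 (OPA\_STORE) does not.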

The eight data value registers contain the data (if any) that the M-unit was attempting to write using the memory operation in the corresponding data control register. These registers are explicitly read by the program via the DATA\_OPD\_SAVE operation.

## 6.4 Speculative Loads

In speculative load mode, some data memory exceptions are deferred until the loaded value is used. In the usual circumstance, exceptional values are never used because the program (correctly) fails to use the prefetched data. Speculative loads allow data prefetching in iterative or recursive computations with data-dependent exit conditions.

When speculative loads are enabled (field "spec\_load\_enable" in SSW), a load with access control FE\_NORMAL into register r that would otherwise raise a data alignment exception, a data segment limit exception, a data map limit exception, or a data protection level exception will instead place a data control descriptor (§6.3) in r and set the corresponding poison flag in the exception register (§9.1). Note that the field "dest\_reg" of the data control descriptor is redundant. All other exceptions, including a data memory retry exception, are raised whether speculative loads are enabled or not. Whenever r is used as a destination register (even by a successful load, speculative or not) its poison flag is cleared. Use of a poisoned register r as a source operand raises a poison exception, except in REG\_STORE, REG\_MOVE, SELECT, and TRAP\_RESTORE operations.
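The poison-flag rules can be modeled with a toy register file. This is a behavioral sketch under stated assumptions: the class and method names are hypothetical, the register count of 32 is inferred from the five-bit "dest\_reg" field, and the operations exempt from the poison exception are not modeled.

```python
class PoisonError(Exception):
    """Stands in for the poison exception."""


class Registers:
    def __init__(self, count=32):
        self.value = [0] * count
        self.poison = [False] * count

    def write(self, r, value, poisoned=False):
        # Any write to r replaces the old poison flag.
        self.value[r] = value
        self.poison[r] = poisoned

    def read_operand(self, r):
        # Using a poisoned register as a source operand traps.
        if self.poison[r]:
            raise PoisonError(f"register {r} is poisoned")
        return self.value[r]
```

A deferred speculative load corresponds to `write(r, descriptor, poisoned=True)`; if the program never reads `r` before overwriting it, no exception is ever seen.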

# Chapter 7: Program Memory

A processor accesses instructions held in a program memory region of the data memory local to the processor. The term "program memory" is used to refer collectively to this region in data memory. This chapter describes what is stored in program memory, the semantics of accessing it, the address translation mechanism, and the instruction cache. Since it is part of data memory, each cell of program memory contains a four-bit access state and a word value, but this access state is ignored by the instruction fetching process. The value is a 64-bit instruction specifying up to three operations, one for each of the M-, A-, and C-units.

## 7.1 Program Memory Address Translation

Program addresses are 32 bits wide and are found in field "pc" occupying the low-order 32 bits of the stream status word and in the target registers. Only 25 bits of the program address space are implemented; the most significant seven bits must be zero. Thus the physical address space of the implementation allows for up to four gigawords of physical program memory but only 32 megawords are currently implemented. Program addresses are word rather than byte addresses and always address the data memory unit attached to the processor. A program address has the structure shown here:

| Bits  | Wd | Field Name       | Type       | Description         |
|-------|----|------------------|------------|---------------------|
| ProgramAddress |  |            |            |                     |
| 31-12 | 20 | prog_page_number | PageNumber | program page number |
| 11-0  | 12 | prog_page_offset | Uns        | program page offset |
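Splitting a program address into its two fields is a shift and a mask. The sketch below also enforces the rule that the seven most significant bits are zero in this implementation; the function name is hypothetical and the use of `ValueError` is merely a stand-in for whatever the hardware would signal.

```python
def split_program_address(pc):
    """Return (page_number, page_offset) for a 32-bit program address."""
    if not 0 <= pc < (1 << 32):
        raise ValueError("program addresses are 32 bits wide")
    if pc >> 25:
        raise ValueError("the most significant seven bits must be zero")
    return pc >> 12, pc & 0xFFF
```

For example, the word address `0x12345` lies in page `0x12` at offset `0x345`.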

Program memory address translation for all streams (regardless of level) uses the translation logic shown in Figure 7.1. Address translation proceeds in four steps. The first logical step is to limit the program page number and select the program map to use for translation. Second, the appropriate program page map for this page is accessed, yielding a program page map entry which is concatenated with the program page offset to form a logical unit offset. Finally, the logical unit offset is scrambled, resulting in a physical unit offset.

If the protection domain's program map limit (found in the protection domain's program state descriptor) is smaller than the program page number, then a program protection exception is raised. The selected map base is added to the page number to yield a unit offset into local program memory, where the page map entry is found.

The program map entry from the program page map has the structure shown below.



FIGURE 7.1: Program Mapping Logic Block Diagram

| Bits  | Wd | Field Name | Type      | Description   |
|-------|----|------------|-----------|---------------|
| ProgramMapEntry |  |    |           |               |
| 31-29 | 3  | 0          |           | reserved      |
| 28-12 | 17 | prog_frame | ProgFrame | frame number  |
| 11-2  | 10 | 0          |           | reserved      |
| 1-0   | 2  | exec_level | Level     | execute level |

The implementation restricts the field "prog\_frame" to values from 0 to  $2^{14}-2$ . To reduce instruction fetch latency, a program map cache saves recently used map entries. This cache is kept coherent with program memory using explicit PROGRAM\_MAP\_FLUSH instructions. The desired entries to flush are specified with a domain program address, which combines the domain and program address to index the program map. This implementation provides a 128-entry program map cache backed by the L2 instruction cache (i.e., the L2 instruction cache holds program map entries as well as instructions). The program map cache is not associative; however, a small fully associative victim cache provides some tolerance for contention. To reduce contention between domains, the domain number times eight is exclusive-or'ed with the set index before addressing the TLB.

| Bits  | Wd | Field Name | Type           | Description                                    |
|-------|----|------------|----------------|------------------------------------------------|
| DomainProgramAddress |  |  |              |                                                |
| 63-60 | 4  | domain     | Uns            | the domain to which this data address pertains |
| 59-32 | 28 | 0000000    |                | reserved                                       |
| 31-0  | 32 | address    | ProgramAddrUns | program memory address                         |

When accessing the program map cache, the program address is equated to the ProgTlbAddr structure below. To flush the mappings for a domain from the TLB, the flush addresses must sequence through all possible values of the field "set\_number".

| Bits  | Wd | Field Name   | Type | Description            |
|-------|----|--------------|------|------------------------|
| ProgTlbAddr |  |          |      |                        |
| 31-25 | 7  | 00           |      | reserved               |
| 24-19 | 6  | tag          | Uns  | program TLB tag        |
| 18-12 | 7  | set_number   | Uns  | program TLB set number |
| 11-0  | 12 | frame_offset | Uns  | untranslated bits      |

Since the TLB is backed by the L2 instruction cache, map entries must be flushed from L2 as well. The structure below is used for addressing map entries in the L2 cache. The PROGRAM\_MAP\_FLUSH instruction automatically forwards a flush to the L2 cache, so that the sub-block of 64 map entries containing the referenced entry is flushed. The PROGRAM\_MAP\_FLUSH\_ANY operation flushes the whole line containing the referenced entry. As before, flush addresses must sequence through all possible values of the field "set\_number" in L2 as well.

| Bits  | Wd | Field Name     | Type | Description             |
|-------|----|----------------|------|-------------------------|
| 31-25 | 7  | 00             |      | reserved                |
| 24-21 | 4  | set_number     | Uns  | set number for L2 cache |
| 20-18 | 3  | line_index     | Uns  | line index              |
| 17-12 | 6  | subblock_index | Uns  | subblock index          |
| 11-0  | 12 | frame_offset   | Uns  | untranslated bits       |

When a stream attempts to issue an instruction, if the privilege level of the stream is not equal to the execution protection level field "exec\_level" in the corresponding program page map entry, then a program protection exception is raised.

The program map yields a 17-bit frame number from field "prog\_frame" which is concatenated to the low-order 12 bits of the program counter, forming a 29-bit logical unit offset.
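The execute-level check and the concatenation can be sketched together. The function below is illustrative only: its name and the exception class are hypothetical, the map limit check and the scrambling step are omitted, and the map entry is taken as an already-fetched 32-bit value laid out as in the program map entry table.

```python
class ProgramProtectionError(Exception):
    """Stands in for the program protection exception."""


def logical_unit_offset(map_entry, pc, stream_level):
    """Form the 29-bit logical unit offset from a program map entry."""
    exec_level = map_entry & 0x3                # bits 1-0
    prog_frame = (map_entry >> 12) & 0x1FFFF    # bits 28-12
    if stream_level != exec_level:
        raise ProgramProtectionError("level does not match exec_level")
    # 17-bit frame number followed by the 12-bit page offset.
    return (prog_frame << 12) | (pc & 0xFFF)
```

With a map entry holding frame 5 and execute level 1, a level-1 stream at program counter `0x12345` obtains the logical unit offset `0x5345`; a level-0 stream is refused.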

The bits of this offset are scrambled exactly as for the data logical unit offset.

The resulting 29-bit logical unit offset is also the physical offset: there is no distributor in the program address translator. The logical unit offset is then sent to the attached data memory unit. If the address is unimplemented, then a program protection exception is raised when an attempt is made to issue an instruction which could not be fetched due to this exception. If the memory system detects an uncorrectable error (such as a double-bit ECC error on the data being retrieved from memory), then an uncorrectable program memory exception is raised.

Note that identical data and program logical offsets in the same data memory will address identical locations as long as the segment's data map entry field "distribution\_enable" is clear.

## 7.2 The Instruction Cache

There is a primary and secondary instruction cache for each processor to reduce the required program memory bandwidth and improve latency. The caches are non-blocking, so that other streams may access the caches while a miss is being handled.

The primary cache (L1) holds 1024 instructions, organized in lines of four words. The 256 lines are two-way associative, so there are 128 sets. Lines are replaced using a least-recently-used policy. Both the primary and secondary caches are tagged with physical addresses. Physical addresses are mapped to the L1 cache as shown below.

| Bits  | Wd | Field Name | Type | Description     |
|-------|----|------------|------|-----------------|
| L1Address |  |          |      |                 |
| 31-26 | 6  | 00         |      | reserved        |
| 25-9  | 17 | tag        | Uns  | physical L1 tag |
| 8-2   | 7  | set_number | Uns  | L1 set number   |
| 1-0   | 2  | line_index | Uns  | L1 line index   |

The secondary cache (L2) holds one quarter million words of instruction and program map data. To reduce the number of tags, the data is organized into lines of 256 words, with eight sub-lines of 32 words each. The 1024 lines are four-way associative, so there are 256 sets. Here, lines are replaced using a random policy. Note that 16 lines can contain a program frame. Physical addresses are mapped to the L2 cache as shown below.

| Bits      | Wd | Field Name        | Type | Description       |
|-----------|----|-------------------|------|-------------------|
| L2Address |    |                   |      |                   |
| 31-26     | 6  | 00                |      | reserved          |
| 25-16     | 10 | tag               | Uns  | physical L2 tag   |
| 15-8      | 8  | set_number        | Uns  | L2 set number     |
| 7-5       | 3  | line_index        | Uns  | L2 line index     |
| 4-0       | 5  | subblock_index    | Uns  | L2 subblock index |
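The two cache address layouts, and the capacities quoted in the text, can be cross-checked with a short sketch. The function names are hypothetical; the field positions come from the L1Address and L2Address structures, and the capacity constants simply multiply sets, ways, and words per line as stated in the prose.

```python
def l1_fields(physical):
    """Return (tag, set_number, line_index) for the L1 cache."""
    return (physical >> 9) & 0x1FFFF, (physical >> 2) & 0x7F, physical & 0x3


def l2_fields(physical):
    """Return (tag, set_number, line_index, subblock_index) for L2."""
    return ((physical >> 16) & 0x3FF, (physical >> 8) & 0xFF,
            (physical >> 5) & 0x7, physical & 0x1F)


# Capacity in words = sets * ways * words per line.
L1_WORDS = 128 * 2 * 4
L2_WORDS = 256 * 4 * 256
```

The products come to 1024 words for L1 and 262144 words (one quarter million) for L2, in agreement with the figures given for each cache.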

The PROGRAM\_CACHE\_FLUSH operations are provided so that the operating system can maintain cache coherence when instructions in program memory are changed. These operations allow program frames to be flushed from the caches, so that subsequent accesses will fetch correct data from program memory.

To flush all entries from a single frame from the L1 and L2 caches, each entry must be flushed with PROGRAM\_CACHE\_FLUSH. The address must sequence through all values of the L1 set number, and all values of the L2 set number within that page, amounting to 128 flushes. To flush all entries from the L1 and L2 caches, each entry must be flushed with PROGRAM\_CACHE\_FLUSH\_ANY. The address must sequence through all values of the L1 set number, and all values of the L2 set number, comprising a total of 256 flushes. PROGRAM\_CACHE\_FLUSH\_L1 has the same effect on the L1 cache as PROGRAM\_CACHE\_FLUSH\_ANY, without affecting the L2 cache.

# Chapter 8: Levels and Protection Domains

## 8.1 Levels

A stream can execute at one of four privilege levels: LEV\_USER, LEV\_SUPER (supervisor), LEV\_KERNEL, and LEV\_IPL (initial program load). Lower levels have fewer privileges. The privilege levels are defined here:

| Name       | Value | Meaning                    |
|------------|-------|----------------------------|
| Level      |       |                            |
| LEV_USER   | 0     | user level                 |
| LEV_SUPER  | 1     | supervisor level           |
| LEV_KERNEL | 2     | kernel level               |
| LEV_IPL    | 3     | initial program load level |

User, supervisor, kernel, and IPL level streams are constrained in addressability by the program and data maps. The data map entries define the minimum privilege levels needed to read and to write each segment, and the program map entries define the exact privilege level needed to execute from each page.
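The asymmetry between the two maps (a minimum level for data, an exact level for execution) is easy to overlook. The following sketch states it as three predicates; the function names are hypothetical, while the enumeration values are those of the Level table.

```python
from enum import IntEnum


class Level(IntEnum):
    LEV_USER = 0
    LEV_SUPER = 1
    LEV_KERNEL = 2
    LEV_IPL = 3


def may_load(stream, load_level):
    return stream >= load_level      # data map: minimum level


def may_store(stream, store_level):
    return stream >= store_level     # data map: minimum level


def may_execute(stream, exec_level):
    return stream == exec_level      # program map: exact level
```

A kernel-level stream may therefore load from a segment whose load level is supervisor, but it may not execute from a page whose execute level is supervisor.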

The LEVEL\_ENTER and LEVEL\_RTN operations change stream privilege levels. A LEVEL\_ENTER must be the first operation executed at an entry point when the caller is from a different privilege level. LEVEL\_RTN restores the original privilege level. The current privilege level is expressly not directly readable by a stream (although it can be inferred from the program map) to simplify the virtualization of privilege levels.

The domain signal exception is set when the privilege level of the issuing stream is less than the domain signal level in the program state. The domain signal level is increased by the operating system when it finds it necessary to communicate with all streams in its domain, e.g. to prepare for a swap.

## 8.2 Protection Domains

A processor supports 16 protection domains, each of which implements an address space. Each domain has several registers holding stream resource limits and accounting information. By convention, one of the protection domains is reserved for operating system daemons.

A stream runs in exactly one protection domain, denoted by D. When one stream activates another using the STREAM\_CREATE operation, the new stream executes in the same protection domain as its creator and therefore inherits all of its creator's job-context. A stream's protection domain D is read by the privileged DOMAIN\_IDENTIFIER\_SAVE operation and written by the privileged DOMAIN\_LEAVE operation.


Each protection domain has counters controlling stream resource allocation, a data state descriptor and a program state descriptor describing the data and program address spaces, and eight performance counters.

## 8.3 Stream Resource Control

The seven-bit counter $SRES_D$ contains the total number of streams reserved in the protection domain by STREAM\_RESERVE operations. The seven-bit counter $SCUR_D$ maintains a count of the actual number of streams in use. It is constrained by the hardware to be less than or equal to $SRES_D$, is incremented by the STREAM\_CREATE operation, and is decremented by the STREAM\_QUIT operation. These counters may also be read by the STREAM\_CUR\_SAVE and STREAM\_RES\_SAVE operations.

The seven-bit counter $SLIM_D$, found in the program state descriptor, contains the maximum number of streams reservable by this protection domain. $SLIM_D$ is an upper bound on $SRES_D$, the number of streams currently reserved by the protection domain. The operating system sets $SLIM_D$ to prevent the protection domain from monopolizing the available streams.

$SLIM_D$ can be set below the current value of $SRES_D$; because STREAM\_QUIT decrements $SRES_D$ as well as $SCUR_D$, both will be coerced lower as streams terminate.
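The interplay of the three counters can be followed in a toy model. This is a sketch under assumptions: the class and method names are hypothetical, a reservation that would exceed the limit is assumed to be refused, and a create with no reserved stream available is assumed to fail; the text here does not specify either failure behavior.

```python
class StreamCounters:
    """Toy model of the per-domain stream counters."""

    def __init__(self, slim):
        self.slim = slim   # SLIM_D: reservation ceiling
        self.sres = 0      # SRES_D: streams reserved
        self.scur = 0      # SCUR_D: streams in use

    def reserve(self, n):
        # Assumption: a reservation that would exceed SLIM_D is refused.
        if self.sres + n > self.slim:
            return False
        self.sres += n
        return True

    def create(self):
        # The hardware keeps SCUR_D <= SRES_D.
        if self.scur >= self.sres:
            raise RuntimeError("no reserved stream available")
        self.scur += 1

    def quit(self):
        # STREAM_QUIT decrements both counters.
        self.scur -= 1
        self.sres -= 1
```

Lowering `slim` below `sres` while streams are running does not take anything away immediately; each `quit` brings `sres` down until it falls under the new limit.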

## 8.4 Data State Descriptor

There is one data state descriptor per protection domain specifying how the data memory operations are interpreted. It is written using the DATA\_STATE\_RESTORE operation. The descriptor is shown here. Note that the field "retry\_limit" is multiplied by four in use. This factor allows retry limits up to 1024. As with memory addresses, the data map limit must have zeros for the six most significant bits in the current implementation.

| Bits  | Wd | Field Name  | Type        | Description |
|-------|----|-------------|-------------|-------------|
| DataStateDescriptor |  |  |          |             |
| 63-60 | 4  | domain      | Uns         | the domain to which this descriptor pertains |
| 59-58 | 2  | min_dkill   | Level       | data minimum level not killed; if a memory operation is selected for issue, and its stream's privilege level is below field "min_dkill", then it fails with result DR_UNIMPLEMENTED_OP, raising data_prot |
| 57-48 | 10 | 0           |             | reserved |
| 47-28 | 20 | limit       | DataSegment | data map limit: the largest data segment number available to the domain; see §6.2 |
| 27-20 | 8  | retry_limit | Uns         | data memory retry limit: bounds the number of times that a memory operation can be retried before failing and raising the data memory retry exception; see §6.1 |
| 19    | 1  | 0           |             | reserved |
| 18-0  | 19 | base        | DataFrame   | data map base, added to data segment numbers to yield an offset into local data memory; see §6.2 |
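Decoding the descriptor follows directly from the bit positions in the data state descriptor table. The sketch below uses hypothetical function names and returns the fields as a plain dictionary; `effective_retry_limit` applies the multiply-by-four rule stated for "retry\_limit".

```python
def decode_data_state(word):
    """Split a 64-bit data state descriptor into a dict of fields."""
    return {
        "domain": (word >> 60) & 0xF,
        "min_dkill": (word >> 58) & 0x3,
        "limit": (word >> 28) & 0xFFFFF,
        "retry_limit": (word >> 20) & 0xFF,
        "base": word & 0x7FFFF,
    }


def effective_retry_limit(word):
    """The stored retry_limit is multiplied by four in use."""
    return 4 * decode_data_state(word)["retry_limit"]
```

A stored "retry\_limit" of 25, for example, bounds a memory operation to 100 retries.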

# 8.5 Program State Descriptor

There is one program state descriptor per protection domain. It contains several kinds of information relating to instruction interpretation within the domain and is written using the PROGRAM\_STATE\_RESTORE operation. This descriptor is defined below:

| Bits  | Wd | Field Name | Type       | Description |
|-------|----|------------|------------|-------------|
| ProgramStateDescriptor | | | | |
| 63-60 | 4  | domain     | Uns        | the domain to which this descriptor pertains |
| 59-58 | 2  | min_pkill  | Level      | program minimum level not killed: if the privilege level of an issued stream is less than the value in field "min_pkill", then the stream branches to a virtual address set by the scan system, presumably to execute a STREAM_QUIT; after branching, the stream has all traps masked and cannot be pkill'd again |
| 57-56 | 2  | min_psleep | Level      | program minimum level not sleeping; if the privilege level of an issued stream is less than the value in field "min_psleep", then all side effects of the instruction are suppressed including counter increments; if both min_pkill and min_psleep are set, psleep has precedence over pkill |
| 55    | 1  | 0          |            | reserved |
| 54    | 1  | priv_t0    | Boolean    | writing target register T0 is privileged, so that unprivileged streams may not freely change the address of the trap handler; see §9.2 |
| 53    | 1  | priv_quit  | Boolean    | STREAM_QUIT operations become privileged, to provide an opportunity to clear the stream's registers before releasing it to the hardware for reallocation; see §2 |
| 52-50 | 3  | 0          |            | reserved |
| 49-48 | 2  | allsig     | Level      | minimum level not signaled; if a stream is selected for issue and its privilege level is less than the value in field "allsig", then the stream will raise the domain signal exception |
| 47    | 1  | 0          |            | reserved |
| 46-40 | 7  | slim       | Uns        | stream limit, SLIMD, which limits the maximum number of streams that may be reserved for this protection domain; see §8.3 |
| 39-38 | 2  | 0          |            | reserved |
| 37-18 | 20 | limit      | PageNumber | program map limit, the largest program page number available to the domain; see §7.1 |
| 17    | 1  | 0          |            | reserved |
| 16-0  | 17 | base       | ProgFrame  | program map base, added to program page numbers to yield an offset into local data memory; see §7.1 |
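The field layout above can be exercised with a small software model. The following is a minimal sketch in Python, not part of the architecture: the bit positions come from the program state descriptor table, while the helper names `extract`, `decode_program_state`, and `encode_program_state` are hypothetical and exist only for illustration.

```python
# Field layout from the ProgramStateDescriptor table: name -> (high bit, low bit).
# The helpers below are an illustrative model only; the hardware descriptor is
# written with PROGRAM_STATE_RESTORE, not with these functions.
PROGRAM_STATE_FIELDS = {
    "domain":     (63, 60),
    "min_pkill":  (59, 58),
    "min_psleep": (57, 56),
    "priv_t0":    (54, 54),
    "priv_quit":  (53, 53),
    "allsig":     (49, 48),
    "slim":       (46, 40),
    "limit":      (37, 18),
    "base":       (16, 0),
}


def extract(word, high, low):
    """Return bits high..low (inclusive) of a 64-bit word as an unsigned integer."""
    width = high - low + 1
    return (word >> low) & ((1 << width) - 1)


def decode_program_state(word):
    """Split a 64-bit descriptor value into its named fields; reserved bits are skipped."""
    return {name: extract(word, hi, lo)
            for name, (hi, lo) in PROGRAM_STATE_FIELDS.items()}


def encode_program_state(**fields):
    """Build a descriptor value from named fields; unspecified fields stay 0."""
    word = 0
    for name, value in fields.items():
        hi, lo = PROGRAM_STATE_FIELDS[name]
        if value < 0 or value >> (hi - lo + 1):
            raise ValueError(f"{name}={value} does not fit in bits {hi}-{lo}")
        word |= value << lo
    return word
```

For example, `decode_program_state(encode_program_state(domain=3, slim=100))` yields a dictionary in which `domain` is 3, `slim` is 100, and every other field is 0.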

# Chapter 9: Exceptions and Traps

An exception is an unexpected condition raised by an event in the user program, the operating system, or the hardware. The exception register summarizes exceptional conditions and the result code register describes floating-point and memory exceptions more fully.

Exceptions can cause a trap to be triggered the next time the stream is ready for execution. However, a set exception flag will not trigger a trap if that trap has been disabled by one of the trap-disable bits in the trap mask of the ssw. If an exception is raised while its trap is disabled and the trap is later enabled, the trap will be taken then. Once raised, an exception flag remains set until explicitly cleared by software.

Multiple exceptions can occur simultaneously. For example, if a stream uses lookahead to issue two concurrent loads, the two loads can finish together (between instructions of the issuing stream). Suppose one load raises a data trap 0 exception and the other load raises a data trap 1 exception. It is up to the program (usually a trap handler) to decide the order in which such exceptions are processed.

# 9.1 Exceptions

The exception register is manipulated using the privileged EXCEPTION\_SAVE and EXCEPTION\_RESTORE operations. An enumeration for the exceptions is shown below.

| Name                                 | Value | Meaning                                     |
|--------------------------------------|-------|---------------------------------------------|
| Exception: Hardware Exceptions       |       |                                             |
| Ex_Data_HW_Error                     | 57    | data memory error or network hardware error |
| Ex_Prog_HW_Error                     | 56    | uncorrectable program memory error          |
| Exception: System Exceptions         |       |                                             |
| Ex_Instruction_Count                 | 52    | instruction count became 0                  |
| Ex_Data_Prot                         | 51    | data protection                             |
| Ex_Prog_Prot                         | 50    | program protection                          |
| Ex_Poison                            | 49    | use of a poisoned register                  |
| Exception: Signal Exceptions         |       |                                             |
| Ex_Domain_Signal                     | 48    | domain signal                               |
| Exception: User Exceptions           |       |                                             |
| Ex_Create                            | 44    | stream create exception                     |
| Ex_Privileged                        | 43    | privileged operation                        |
| Ex_Data_Alignment                    | 42    | unaligned data exception                    |
| Ex_Data_Blocked                      | 41    | data memory retry exception or data trap    |
| Ex_Float_Extension                   | 40    | float software extension exception          |
| Exception: Floating-point Exceptions |       |                                             |
| Ex_Float_Invalid                     | 36    | float invalid operation                     |
| Ex_Float_Zero_Divide                 | 35    | float zero divide                           |
| Ex_Float_Overflow                    | 34    | float overflow                              |
| Ex_Float_Underflow                   | 33    | float underflow                             |
| Ex_Float_Inexact                     | 32    | float inexact                               |

Of these exceptions, four are raised by an instruction before it can execute: prog\_hw\_error, prog\_prot, poison, and privileged. If one of these exceptions is raised while it is masked, the stream will hang. If such an exception is raised and is unmasked, the stream will trap with the trap pc (in T0) pointing to the instruction that caused the exception. If possible, the trap handler could apply some "antidote" and then try to execute the instruction again by returning to T0.

Two exceptions are not raised by the instruction at all: domain\_signal and instruction\_count. Generally, the trap handler can service the event and return to the program address in T0 to continue. When these exceptions are masked, they are simply ignored.

Most exceptions are caused by the previous instruction. Since that instruction may have contained a jump, its address is lost. Exceptions in this class include create and all the floating-point exceptions.

Presumably, a failing STREAM\_CREATE could be handled by reserving a stream and then retrying the create. The easiest way to retry the create is to decrement T0 and jump to it. In this case, there could be no jump in the previous operation since a create is a MAC-operation. A masked create exception will be ignored, although no stream will have been created.

On a floating-point exception, the handler may want to examine the source registers, the operation, and the value written to the destination. Sometimes, it will want to place a new value in the destination based on the above information as well as on some global state. Masked floating-point exceptions are simply ignored.

Finally, there are the data exceptions, which may be caused by any of the last eight instructions executed. These are data\_hw\_error, data\_prot, data\_alignment, and data\_blocked. To handle these exceptions, the trap routine should check the data result codes and the corresponding DATA\_OPA and DATA\_OPD state.

The exception register is shown below. There is one bit in the upper half of the exception register for each member of the Exception enumeration shown above. Each of these exception bits can cause a trap, but some ssw bit disables that trap; the left-hand column tells which one.

The lower half of the exception register contains the poison flags. A speculative load operation that would otherwise raise an exception as described in §6.4 instead sets the poison flag corresponding to its destination register. These flags do not trigger traps directly, as confirmed by the "no trap" annotation. Should an instruction use a poisoned register as a source operand, the poison exception will be raised and a trap may then occur. The poisoned register contains a data control descriptor (§6.3), permitting re-execution or diagnosis of the failed load operation by the exception handler.

| Trap Disable Bit  | Bits  | Wd  | Field Name        | Type | Description |
|-------------------|-------|-----|-------------------|------|-------------|
| ExceptionRegister: Hardware Exceptions | | | | | |
|                   | 63-58 | 6   | 0                 |      | reserved |
| hardware          | 57    | 1   | data_hw_error     | Flag | data memory error or network hardware error; see §6.2 |
| hardware          | 56    | 1   | prog_hw_error     | Flag | uncorrectable program memory error; see §7.1 |
| ExceptionRegister: System Exceptions | | | | | |
|                   | 55-53 | 3   | 0                 |      | reserved |
| system            | 52    | 1   | instruction_count | Flag | instruction count became 0; see §10 |
| system            | 51    | 1   | data_prot         | Flag | data protection level, map limit, segment limit exceeded, unimplemented op, or unimplemented address; see §6.2 |
| system            | 50    | 1   | prog_prot         | Flag | program protection level, limit violation, or unimplemented address; see §7.1 |
| system            | 49    | 1   | poison            | Flag | use of a poisoned register; see §6.4 |
| ExceptionRegister: Signal Exceptions | | | | | |
| domain signal     | 48    | 1   | domain_signal     | Flag | domain signal: set when the stream level is less than the domain signal level; see §8.1 |
| ExceptionRegister: User Exceptions | | | | | |
|                   | 47-45 | 3   | 0                 |      | reserved |
| user              | 44    | 1   | create            | Flag | stream create exception: attempt to create more streams than are reserved; see the STREAM_CREATE operation |
| user              | 43    | 1   | privileged        | Flag | unimplemented or privileged operation; see §3 |
| user              | 42    | 1   | data_alignment    | Flag | unaligned data exception; see §6.1 |
| user              | 41    | 1   | data_blocked      | Flag | data memory retry exception, latency limit exception, or data trap 0 or 1; see §6.1 |
| user              | 40    | 1   | float_extension   | Flag | float software extension; see §5.3 |
| ExceptionRegister: Floating-point Exceptions | | | | | |
|                   | 39-37 | 3   | 0                 |      | reserved |
| float invalid     | 36    | 1   | float_invalid     | Flag | float invalid operation |
| float zero divide | 35    | 1   | float_zero_divide | Flag | float zero divide |
| float overflow    | 34    | 1   | float_overflow    | Flag | float overflow |
| float underflow   | 33    | 1   | float_underflow   | Flag | float underflow |
| float inexact     | 32    | 1   | float_inexact     | Flag | float inexact |
| ExceptionRegister: Poison Flags | | | | | |
| (no trap)         | 31    | 1   | pf31              | Flag | poison flag<sub>31</sub> |
| ...               | ...   | ... | ...               | ...  | ... |
| (no trap)         | 1     | 1   | pf1               | Flag | poison flag<sub>1</sub> |
|                   | 0     | 1   | 0                 |      | reserved |
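As an illustration of how a trap handler or debugger might interpret a saved exception register value, the following Python sketch lists the raised exceptions and the set poison flags. The bit positions are those of the Exception enumeration above. The function names are hypothetical, and treating poison flag *n* as belonging to register *n* is an assumption made for this sketch.

```python
# Exception flag positions in the upper half of the exception register,
# taken from the Exception enumeration in this section.
EXCEPTION_BITS = {
    57: "data_hw_error",
    56: "prog_hw_error",
    52: "instruction_count",
    51: "data_prot",
    50: "prog_prot",
    49: "poison",
    48: "domain_signal",
    44: "create",
    43: "privileged",
    42: "data_alignment",
    41: "data_blocked",
    40: "float_extension",
    36: "float_invalid",
    35: "float_zero_divide",
    34: "float_overflow",
    33: "float_underflow",
    32: "float_inexact",
}


def raised_exceptions(exception_reg):
    """Return the names of all set exception flags, highest bit first."""
    return [name
            for bit, name in sorted(EXCEPTION_BITS.items(), reverse=True)
            if (exception_reg >> bit) & 1]


def poisoned_registers(exception_reg):
    """Return the indices n (1..31) whose poison flag pf_n is set.

    Assumption: pf_n occupies bit n and refers to register n; bit 0 is reserved.
    """
    return [n for n in range(1, 32) if (exception_reg >> n) & 1]
```

Because several flags may be set at once, both helpers return lists; the order in which the exceptions are then serviced is left to the handler.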

The result code register contains a more detailed description of the results of memory and floating-point operations. The structure of the result code register is described here.

| Bits  | Wd  | Field Name          | Type            | Description                     |
|-------|-----|---------------------|-----------------|---------------------------------|
| ResultCode: A-unit Float Results | | | | |
| 63-56 | 8   | 0                   |                 | reserved                        |
| 55-51 | 5   | A_float_result_reg  | Reg             | previous A-unit result register |
| 50-48 | 3   | A_float_result_code | FloatResultCode | A-unit result code              |
| ResultCode: C-unit Float Results | | | | |
| 47-40 | 8   | 0                   |                 | reserved                        |
| 39-35 | 5   | C_float_result_reg  | Reg             | previous C-unit result register |
| 34-32 | 3   | C_float_result_code | FloatResultCode | C-unit result code              |
| ResultCode: M-unit Data Results | | | | |
| 31-28 | 4   | dr7                 | DataResultCode  | data result<sub>7</sub>         |
| ...   | ... | ...                 | ...             | ...                             |
| 3-0   | 4   | dr0                 | DataResultCode  | data result<sub>0</sub>         |

The FloatResultCode that is stored in field "A\_float\_result\_code" or field "C\_float\_result\_code" is described below. When no exception or only float\_inexact is raised, the result code is set to "FR\_FG". When float\_invalid, float\_zero\_divide, float\_overflow, or float\_underflow is raised, the result code is set to "FR\_FX". All other result codes are coupled with the float\_extension exception. The result register field is written whether or not the result code is nonzero.

| Name            | Value | Meaning                                       |
|-----------------|-------|-----------------------------------------------|
| FloatResultCode |       |                                               |
| FR_FG           | 0     | float good                                    |
| FR_IM           | 3     | operand to integer multiply is too large      |
| FR_FX           | 4     | float is exceptional                          |
| FR_DZ           | 5     | divide by zero                                |
| FR_DR           | 6     | denormalized operand to FLOAT_RECIP_APPROX    |
| FR_DQ           | 7     | denormalized operand to FLOAT_RSQRT_APPROX    |

Each of the four-bit data result fields in the low-order part of the result code register is written when an M-operation completes and contains one of the values shown below.

| Name                        | Value | Meaning                                                                      |
|-----------------------------|-------|------------------------------------------------------------------------------|
| Data Result Code            |       |                                                                              |
| DR_NONE                     | 0     | the operation completed successfully                                         |
| DR_DATA_TRAP0               | 1     | data trap 0 exception                                                        |
| DR_DATA_TRAP1               | 2     | data trap 1 exception                                                        |
| DR_DATA_TRAP01              | 3     | both data trap 0 and data trap 1 exception                                   |
| DR_RETRY_LIMIT              | 4     | data memory retry exception                                                  |
| DR_LATENCY_LIMIT            | 5     | data memory latency exception                                                |
| DR_DATA_ALIGNMENT           | 6     | data alignment exception                                                     |
| DR_UNIMPLEMENTED_OP         | 7     | unimplemented operation by DATA_OP_REDO, or aborted by dkill; see §8.4       |
| DR_MAP_LIMIT                | 8     | data map limit exception                                                     |
| DR_PROTECTION_LEVEL         | 9     | data protection level exception                                              |
| DR_SEGMENT_LIMIT            | 10    | data segment limit exception                                                 |
| DR_UNIMPLEMENTED_ADDRESS    | 11    | data address unimplemented exception                                         |
| DR_UNCORRECTABLE_ERROR      | 12    | uncorrectable data memory exception                                          |

The data result fields in the result code register are normally read by the trap handler to diagnose failing memory operations. Data result code *i* corresponds to the data control descriptor and data retrieved by the DATA\_OPA\_SAVE or DATA\_OPD\_SAVE operations with opno *i*. To simplify trap handling with one or zero failing memory operations, the most recent failing memory operation is relabeled as opno 0.
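A handler's first step when diagnosing data exceptions can be modeled as unpacking the eight four-bit data result fields. The Python sketch below is illustrative only: it assumes that data result *i* occupies bits 4*i*+3 through 4*i*, which is consistent with dr7 at bits 31-28 and dr0 at bits 3-0 but is not spelled out for the intermediate fields. The function names are hypothetical.

```python
# Code values from the FloatResultCode and DataResultCode enumerations above.
FLOAT_RESULT_CODES = {
    0: "FR_FG", 3: "FR_IM", 4: "FR_FX", 5: "FR_DZ", 6: "FR_DR", 7: "FR_DQ",
}

DATA_RESULT_CODES = {
    0: "DR_NONE",
    1: "DR_DATA_TRAP0",
    2: "DR_DATA_TRAP1",
    3: "DR_DATA_TRAP01",
    4: "DR_RETRY_LIMIT",
    5: "DR_LATENCY_LIMIT",
    6: "DR_DATA_ALIGNMENT",
    7: "DR_UNIMPLEMENTED_OP",
    8: "DR_MAP_LIMIT",
    9: "DR_PROTECTION_LEVEL",
    10: "DR_SEGMENT_LIMIT",
    11: "DR_UNIMPLEMENTED_ADDRESS",
    12: "DR_UNCORRECTABLE_ERROR",
}


def data_results(result_code):
    """Return the eight data result codes as names, indexed by opno (dr0 first).

    Assumption: dr_i occupies bits 4*i+3 .. 4*i of the result code register.
    """
    return [DATA_RESULT_CODES.get((result_code >> (4 * i)) & 0xF, "undefined")
            for i in range(8)]


def failing_opnos(result_code):
    """Return the opno values whose data result is anything other than DR_NONE."""
    return [i for i, name in enumerate(data_results(result_code))
            if name != "DR_NONE"]


def c_unit_float_code(result_code):
    """Return the name of the C-unit float result code held in bits 34-32."""
    return FLOAT_RESULT_CODES.get((result_code >> 32) & 0x7, "undefined")
```

Each opno returned by `failing_opnos` identifies the operation whose descriptor and data a handler would then retrieve with DATA\_OPA\_SAVE or DATA\_OPD\_SAVE.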

# 9.2 Traps

A trap exchanges the SSW.pc with the contents of target register T0 and sets ssw\_override mode in the SSW. A trap is taken when an exception is raised while its corresponding trap disable bit is clear, or when a trap disable bit is cleared while a corresponding exception bit is set; see §9.1. A stream that traps does not change its privilege. The trap is lightweight in the sense that only a small amount of state need be saved before control is transferred to a user-supplied exception handler. While the ssw\_override flag is set, all traps are masked, lookahead is disabled, and the instruction counter is disabled. In addition, all but synchronizing loads, stores, and int\_fetch\_adds will retry forever to prevent spurious retries from causing a nested exception. The trap handler should return to the main program using a LEVEL\_RTN with the appropriate level. This form of jump will clear ssw\_override mode, allowing the next instruction to use the true ssw mode and trap mask bits.
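The trap condition can be expressed as a simple predicate. The following Python sketch is a behavioral model under stated assumptions, not a description of the hardware: the set of disabled trap classes stands in for the trap-disable bits of the ssw, whose bit layout is not given in this chapter, and the names `trap_pending` and `take_trap` are hypothetical.

```python
# Exception bit -> name of the ssw trap-disable bit that masks it, following
# the left-hand column of the exception register table in section 9.1.
# Poison flags (bits 31..1) are absent because they never trigger traps directly.
TRAP_DISABLE_FOR_BIT = {
    57: "hardware", 56: "hardware",
    52: "system", 51: "system", 50: "system", 49: "system",
    48: "domain signal",
    44: "user", 43: "user", 42: "user", 41: "user", 40: "user",
    36: "float invalid",
    35: "float zero divide",
    34: "float overflow",
    33: "float underflow",
    32: "float inexact",
}


def trap_pending(exception_reg, disabled, ssw_override=False):
    """Return True if some set exception flag has its trap enabled.

    `disabled` is a set of trap-disable names (a stand-in for the ssw trap mask).
    While ssw_override is set, all traps are masked.
    """
    if ssw_override:
        return False
    return any((exception_reg >> bit) & 1 and name not in disabled
               for bit, name in TRAP_DISABLE_FOR_BIT.items())


def take_trap(pc, t0):
    """Model a trap: exchange pc with T0 and set ssw_override.

    Returns (new_pc, new_t0, ssw_override).
    """
    return t0, pc, True
```

Since exception flags stay set until software clears them, evaluating `trap_pending` again after a trap-disable bit has been cleared reproduces the deferred trap: a create exception raised while "user" is disabled yields `False`, and the same register value with "user" enabled yields `True`.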

Operations that set T0 are supervisor-privileged if field "priv\_t0" is set in the program state descriptor of the protection domain; this option allows auditing of security-relevant events by a trusted (but not necessarily privileged) trap handler. However, restoration of the trap handler entry point when resuming execution of the interrupted activity must be done at privilege level LEV\_SUPER or higher if field "priv\_t0" is set.

Note that a stream can disable all traps, including system and hardware traps, although it may be unwise to do so. Disabling traps will not necessarily stall the processor or the stream. The operating system can easily regain control by raising field "min\_pkill" in the program state descriptor. If the stream's privilege level is less than field "min\_pkill", then the stream will branch to a fixed virtual address, generally containing a STREAM\_QUIT. If a stream were to encounter a prog\_prot, prog\_hw\_error, or privileged exception after branching in response to pkill, hardware diagnostic intervention is necessary to recover the stream.

There are eight trap registers available for the trap handler to use as temporary storage as it saves or restores processor state. Other uses are discouraged. The trap registers are manipulated by the TRAP\_SAVE and TRAP\_RESTORE operations. Due to hardware limitations, trap register sets are allocated to and deallocated from streams on demand. That is, the first TRAP\_RESTORE a stream performs will allocate a trap register set. That set will be deallocated when the stream issues a TRAP\_SAVE of TR0. The current implementation provides 32 sets to serve the 128 streams. To protect the operating system, the last trap register set will not be allocated to a user-level stream. If a stream tries to allocate a trap register set and fails, that issue is squashed.

Every taken trap counts as a CNT\_TRAP event.

# Chapter 10: Resource Counters

## 10.1 Instruction Counter

Each stream has a 16-bit unsigned instruction counter that is intended for debugger support. When an instruction issues, the instruction counter is decremented if the field "count\_disable" in the ssw is not set and the counter is not already zero. If the instruction counter becomes zero, then an instruction count exception is raised. The instruction counter is set by the STREAM\_COUNT\_INST\_RESTORE operation and is read by the STREAM\_COUNT\_INST operation.
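The decrement rule can be modeled in a few lines. The Python class below is a minimal sketch of the behavior just described; the class and method names are hypothetical, and `restore` merely stands in for the effect of STREAM\_COUNT\_INST\_RESTORE.

```python
class InstructionCounter:
    """Behavioral model of the per-stream 16-bit instruction counter."""

    def __init__(self, value=0):
        self.restore(value)

    def restore(self, value):
        """Set the counter (models the effect of STREAM_COUNT_INST_RESTORE)."""
        if not 0 <= value < (1 << 16):
            raise ValueError("instruction counter is a 16-bit unsigned value")
        self.value = value

    def issue(self, count_disable=False):
        """Model one instruction issue.

        Returns True if this issue raises the instruction count exception,
        that is, if the counter was decremented and became zero.
        """
        if count_disable or self.value == 0:
            return False  # the counter is left unchanged
        self.value -= 1
        return self.value == 0
```

A debugger that wants to regain control after *n* instructions would, in this model, restore the counter to *n*; the *n*th counted issue then returns `True`. A counter that is already zero raises nothing further.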

## 10.2 Protection Domain Counters

Each protection domain maintains eight 64-bit resource counters. These counters are only updated every 256 cycles, which limits their resolution.

#### instruction issue counter

The instruction issue counter increments when an instruction issues in the domain. This counter is read by the COUNT\_ISSUES operation.

#### memory reference counter

The memory reference counter counts the number of memory LOAD, STORE, FETCH\_ADD, or STATE operations that are issued in the domain. This counter does not count memory retries or additional memory fetches required for forwarding. When divided by instruction issues, this counter provides an indication of the average number of memory references per instruction. This counter is read by the COUNT\_MEMREFS operation.

#### stream counter

The stream counter is incremented every 256 ticks by the contents of the protection domain's SRESD counter. When multiplied by 256 and divided by cycles, this counter provides an indication of the average stream usage of the domain. This counter is read by the COUNT\_STREAMS operation.

#### concurrency counter

The concurrency counter is incremented every 256 ticks by the number of memory operations in the protection domain that have issued but not yet completed. When multiplied by 256 and divided by cycles, this counter provides an indication of the average number of memory operations in progress. This counter is read by the COUNT\_CONCURRENCY operation.
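The derived figures mentioned for the memory reference, stream, and concurrency counters are simple ratios. The Python sketch below shows the arithmetic and a toy model of the 256-tick sampling; all function names are hypothetical, and `cycles` is assumed to be the number of clock ticks elapsed over the measurement interval.

```python
SAMPLE_PERIOD = 256  # the stream and concurrency counters are updated every 256 ticks


def sample_counter(per_tick_values):
    """Toy model of a sampled counter: add the observed value once every 256 ticks."""
    total = 0
    for tick, value in enumerate(per_tick_values, start=1):
        if tick % SAMPLE_PERIOD == 0:
            total += value
    return total


def memrefs_per_instruction(memrefs, issues):
    """Average memory references per instruction issued in the domain."""
    return memrefs / issues


def average_from_samples(counter, cycles):
    """Average level implied by a sampled counter: counter * 256 / cycles.

    Applies to both the stream counter (average stream usage) and the
    concurrency counter (average memory operations in progress).
    """
    return counter * SAMPLE_PERIOD / cycles
```

If a domain holds 10 streams reserved for 2560 ticks, the modeled stream counter accumulates 10 samples of 10, and `average_from_samples(100, 2560)` recovers the average of 10 streams. The estimate is only as fine as the 256-tick sampling allows.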

#### selectable event counters

The four selectable event counters can be set to count any four of a sizable number of events. The events counted are selected by the event counter select register, which is set by the supervisor-privileged COUNT\_SELECT\_RESTORE operation and is read by the COUNT\_SELECT\_SAVE operation. The event counter select register has the structure shown here:

| Bits  | Wd | Field Name | Type        | Description             |
|-------|----|------------|-------------|-------------------------|
| 63-32 | 32 | 0          |             | reserved                |
| 31-24 | 8  | sel_0      | CountSource | tag for event counter 0 |
| 23-16 | 8  | sel_1      | CountSource | tag for event counter 1 |
| 15-8  | 8  | sel_2      | CountSource | tag for event counter 2 |
| 7-0   | 8  | sel_3      | CountSource | tag for event counter 3 |

The value of a selectable event counter is read by the COUNT\_EVENTS operation.

The CountSource tag can be one of the values shown below. Setting the tag to an undefined value has undefined results. Setting the tag to denote a dedicated counter has undefined results.

| Name                             | Value | Meaning                                           |
|----------------------------------|-------|---------------------------------------------------|
| CountSource: other operations    |       |                                                   |
| CNT_M_NOP                        | 0     | NOP operations executed by the M-unit             |
| CNT_A_NOP                        | 1     | NOP operations executed by the A-unit             |
| CNT_C_NOP                        | 2     | NOP operations executed by the C-unit             |
| CountSource: target registers    |       |                                                   |
| CNT_TARGET                       | 3     | TARGET set operations (not including TARGET_SAVE) |
| CountSource: data memory         |       |                                                   |
| CNT_LOAD                         | 4     | LOAD operations issued                            |
| CNT_STORE                        | 5     | STORE operations issued                           |
| CNT_INT_FETCH_ADD                | 6     | INT_FETCH_ADD operations issued                   |
| CNT_MEM_RETRY                    | 7     | memory operations retried, including forwarding   |
| CountSource: floating operations |       |                                                   |
| CNT_FLOAT_ADD                    | 8     | FLOAT_ADD and FLOAT_SUB operations                |
| CNT_FLOAT_MUL                    | 9     | FLOAT_ADD_MUL operations                          |
| CNT_FLOAT_DIV                    | 10    | FLOAT_DIV operations                              |
| CNT_FLOAT_SQRT                   | 11    | FLOAT_SQRT operations                             |
| CNT_FLOAT_TOTAL                  | 12    | total floating-point operations                   |
| CountSource: branches            |       |                                                   |
| CNT_JUMP_EXPECTED                | 13    | expected JUMP or SKIP path taken                  |
| CNT_JUMP_UNEXPECTED              | 14    | unexpected JUMP or SKIP path                      |
| CNT_TRANSFER_TOTAL               | 15    | sum of all transfer operations                    |
| CNT_LEVEL                        | 16    | LEVEL_ENTER operations                            |
| CountSource: traps               |       |                                                   |
| CNT_TRAP                         | 17    | traps taken                                       |
| CountSource: streams             |       |                                                   |
| CNT_CREATE                       | 18    | STREAM_CREATE operations                          |
| CNT_QUIT                         | 19    | STREAM_QUIT operations                            |
| CountSource: dedicated counters  |       |                                                   |
| CNT_ISSUES                       | 128   |                                                   |
| CNT_MEMREFS                      | 129   |                                                   |
| CNT_STREAMS                      | 130   |                                                   |
| CNT_CONCURRENCY                  | 131   |                                                   |

# 10.3 Processor Counters

Each processor maintains three 64-bit counters.

#### clock

The clock increments once every tick. It is initialized using the hardware scan mechanism during IPL. By convention, the clocks on all processors are synchronized; that is, they agree in value. The clock is read by the CLOCK operation.

#### phantom counter

The phantom counter counts the number of instruction issue slots unused by its processor. Normally, an instruction will execute to completion once issued, so that true phantoms are the main source of unused slots. However, some exceptions, such as a poison exception, will cause the triggering instruction to be aborted, "wasting" an issue slot. These aborted instructions count as phantoms as well.

#### ready counter

The ready counter on each processor sums the total number of streams ready at each tick of its processor. However, a stream that issues as soon as it becomes ready will not contribute to this counter. Due to stream scheduling constraints, a ready stream may wait a few cycles before it issues, even on a processor with free issue slots.

# Chapter 11: Operation Descriptions

## 11.1 Notation

A TERA MTA assembly language program has the same syntactic form as a sequence of Lisp expressions. Each instruction has the format:

(INST lookahead M-operation A-operation C-operation)

If any of the M-, A-, or C-operations is not specified, then the assembler will fill the missing operation with a NOP. Operations should be specified in order when one of them can be executed by more than one functional unit, e.g. FLOAT\_ADD.

An operation belongs to exactly one group. The operations in a group have minor differences, such as in the way operands are addressed, whether the M-, A-, or C-unit does the operation, and so forth. Each page in this section describes an operation group. The following page describes a sample operation group.

## (OPERATION\_1 operand-template)

$$\dots_{42}\ u_{37}\ v_{32}\ 01_{27}\ 02_{21}\ \dots_{0} \qquad \text{CLASS}$$

pseudo code description of operation\_1 {where variable $\in$ set}

## (OPERATION\_2 operand-template)

$$u_1 v_2 v_3 v_4 v_5 v_5 v_5 v_6$$
 CLASS

pseudo code description of operation\_2 {where variable \in set}

The operations in this group (here, OPERATION\_1 and OPERATION\_2) are described in more detail in this section.

The CLASS describes the set of M, A, and C instruction fields used by this operation.

An identifier always refers to the same value within the description of an operation. The identifiers r, s, t, u, v, w, x, y, z always refer to the contents of a register in a fixed position within the 64-bit instruction. Subscripts on identifiers are bit subscripts. Unless otherwise stated, the range of a subscript is from 63 down to 0. Bit numbers increase from right to left. An identifier denoting an immediate constant is quoted, e.g. 'disp. The range of an immediate constant is always constrained by a where clause.

The clause " $\{where \ predicate\}$ " is a constraint. The predicate is a Boolean function that must be true. The most common constraints bound the range of immediate constants.

The pseudo code description uses a conventional Algol-like language containing flow statements, assignment statements, and expressions, built from operators and operands. Operators are described below. An operand may be a constant (interpreted in base 10); a quoted identifier, such as 'disp, denoting an immediate value; or a non-quoted identifier, such as r, denoting the contents of the register addressed by the value bound to that identifier.

The assembly code prototype for an operation is: "(OPERATION operand-template)". The encoding for each operation is described using fields. The fields used by an operation are listed from left to right within the word, from bit 63 to bit 0. Each field has two parts. The field end is the field's low-order bit number. The field fill can be an identifier, which stands for its value; a literal constant, which is represented in hexadecimal; or an ellipsis representing "holes" in the object encoding. For example, the encoding for OPERATION\_1 denotes a field starting at bit 41 and ending at bit 37 containing the register number u, a field starting at bit 36 and ending at bit 32 containing the register number v, a field starting at bit 31 and ending at bit 27 containing the literal value "01<sub>16</sub>", and finally a field starting at bit 26 and ending at bit 21 containing the literal value "02<sub>16</sub>".
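The field notation can be illustrated with a small sketch. It is not part of the architecture definition: the helper names `field` and `encode_operation_1` are hypothetical, and the holes in the encoding are simply left as zero here.

```python
def field(word: int, hi: int, lo: int) -> int:
    """Return bits hi..lo (inclusive) of a 64-bit word; bit 0 is the rightmost bit."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def encode_operation_1(u: int, v: int) -> int:
    """Assemble the four OPERATION_1 fields named in the text.

    Each field is placed at its field end (low-order bit number):
    u at 37, v at 32, literal 01 (hex) at 27, literal 02 (hex) at 21.
    Bits belonging to the holes are left as 0 (an assumption for this sketch).
    """
    if not (0 <= u < 32 and 0 <= v < 32):
        raise ValueError("register numbers must fit in five bits")
    return (u << 37) | (v << 32) | (0x01 << 27) | (0x02 << 21)


word = encode_operation_1(u=5, v=9)
print(hex(word))
```

Reading the fields back with `field(word, 41, 37)` and `field(word, 36, 32)` recovers the two register numbers.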

For the most part, the order of fields in the object code encoding is the same as the order of fields in assembly language. Deviations from this rule are noted explicitly. In any event, the mapping from assembly to machine code is manifest in the encodings.

#### **EXAMPLES**

An example of how the instruction might be used is given here.

#### RAISES

The exceptions that the operation may raise are enumerated here.

#### COUNTS AS

The event counters that are incremented by this operation.

#### SEE ALSO

The names of related operations and section numbers are given here.

# 11.2 Operation Naming Conventions

The mnemonics for operations are chosen according to these general rules.

- The mnemonics for most operations start with the name of the data type being manipulated. The principal exceptions are the STORE family, the SHIFT family, the JUMP family, and the miscellaneous operations dealing with program state. The data type prefixes are shown in Figure 11.1.
- Mnemonics ending with "\_TEST" generate a condition code.
- Mnemonics containing "\_RESTORE" move value(s) from general purpose registers into special registers.
- Mnemonics containing "\_SAVE" move value(s) from special registers into general purpose registers.
- Mnemonics containing "IMM" contain a small immediate constant operand. Some of these operations accept one value in the assembly code, but place another value in the object code. For example, INT\_ADD\_IMM takes and adds a value from 1 to 32, but the value put in the object code ranges from 0 to 31; the hardware increments this value to produce the desired sum.
- Mnemonics containing "MAP" deal with the program or data map.

| data type      | what                                                        |
|----------------|-------------------------------------------------------------|
| BIT            | a word of bits                                              |
| BIT_MAT        | an 8 * 8 matrix of bits packed into a word                  |
| COUNT          | a resource count                                            |
| DATA           | M-unit state                                                |
| DOMAIN         | a protection domain                                         |
| EXCEPTION      | an exception                                                |
| FLOAT          | a floating-point number                                     |
| INT            | a signed integer                                            |
| LEVEL          | a privilege level                                           |
| LOGICAL        | a 64-bit wide logical value: 0 (false), 1 (true) or -1 (true) |
| PTR            | a pointer to memory, with access control                    |
| STATE          | memory access state bits                                    |
| STREAM         | an instruction stream                                       |
| UNS            | an unsigned integer                                         |

FIGURE 11.1: Data Type Prefixes

- Mnemonics containing "AC" use access control from the operation when computing an address.
- Mnemonics containing "INDEX" use scaled indexing when computing an address.
- Mnemonics containing "\_DISP" use scaled displacements when computing an address.

# 11.3 Pseudo-code Operators

Infix operators in expressions have the precedence and associativity customary to the C language. Parentheses are used in complex expressions to avoid confusion. The operator ";" is lowest precedence and separates sequentially executed statements<sup>1</sup>. In addition, the operators shown in Figure 11.2 are used.

<sup>1</sup> Note that in ISP, ";" means parallel execution of the statements. We adopt the conventional Algol semantics.

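| operator          | what                               |
|-------------------|------------------------------------|
| ←                 | assignment                         |
| store             | store to memory                    |
| load              | load from memory                   |
| ∨                 | logical or                         |
| ∧                 | logical and                        |
| ⊕                 | logical exclusive-or               |
| +                 | addition                           |
| −                 | subtraction                        |
| *                 | multiplication                     |
|                   | division                           |
| min               | minimum value                      |
| max               | maximum value                      |
|                   | square root                        |
| tally             | the number of 1 bits               |
| ∈                 | is a member of                     |
| ≫<sub>a</sub>     | shift right, with sign bit filling |
| ≫                 | shift right, with 0 filling        |
| ≪                 | shift left, with 0 filling         |
|                   | rotate right                       |
|                   | rotate left                        |
| [i … j]           | range i, i+1, …, j−1, j            |
| a<sub>i</sub>     | bit number i from a                |
| a<sub>r</sub>     | bits in the range r from a         |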

FIGURE 11.2: Pseudo-code Operators

(BIT\_AND $t\ u\ v$)

$$t \leftarrow u \wedge v$$

(BIT\_AND $x\ y\ z$)

$$x \leftarrow y \wedge z$$

(BIT\_AND\_TEST $t\ u\ v$)

$$t \leftarrow u \wedge v$$

(BIT\_AND\_TEST $x\ y\ z$)

$$x \leftarrow y \wedge z$$

These operations compute bitwise and.

BIT\_AND\_TEST never generates overflow/NaN or carry.

## **RAISES**

(nothing)

(BIT\_IMP $t\ u\ v$)

$$t \leftarrow \neg u \lor v$$

(BIT\_IMP\_TEST $t\ u\ v$)

$$t \leftarrow \neg u \lor v$$

These operations compute bitwise implication.

BIT\_IMP\_TEST never generates overflow/NaN or carry.

# **RAISES**

(nothing)

(BIT\_LEFT\_ONES $x\ y$)

$$x \leftarrow \min\{i \mid (y \ll i) \ge 0\}$$

(BIT\_LEFT\_ONES\_TEST $x\ y$)

$$x \leftarrow \min\{i \mid (y \ll i) \ge 0\}$$

(BIT\_LEFT\_ZEROS $x\ y$)

$$x \leftarrow 64 - \min\{i \mid (y \gg i) = 0\}$$

(BIT\_LEFT\_ZEROS\_TEST $x\ y$)

$$x \leftarrow 64 - \min\{i \mid (y \gg i) = 0\}$$

These operations respectively return the number of consecutive 1- or 0-bits on the left end of the word in y.

The \_TEST versions of these operations generate carry when the result is 64 and never generate overflow/NaN.
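The two formulas can be modelled in a few lines of Python. This is only a sketch of the arithmetic result: the function names and the `carry` helper are hypothetical, and nothing here models how the hardware computes the value.

```python
MASK64 = (1 << 64) - 1


def bit_left_zeros(y: int) -> int:
    """64 - min{i | (y >> i) = 0}: the number of 0-bits at the left end of y."""
    y &= MASK64
    # The smallest i with (y >> i) == 0 is exactly y.bit_length().
    return 64 - y.bit_length()


def bit_left_ones(y: int) -> int:
    """min{i | (y << i) >= 0 as a signed word}: the number of 1-bits at the left end."""
    # Leading ones of y are leading zeros of its 64-bit complement.
    return bit_left_zeros(~y & MASK64)


def carry(result: int) -> bool:
    """Model of the _TEST carry rule stated above: carry when the result is 64."""
    return result == 64


print(bit_left_ones(0xF000_0000_0000_0000), bit_left_zeros(1))
```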

### EXAMPLES

A linear search for the leftmost 0-bit in a contiguous block of words pointed to by p could be done using the loop shown below. The code returns the bit offset from the leftmost bit of the vector, and assumes that a zero will eventually be found.

```
(INST 0 (LOAD n p) (TARGET_DISP t_loop loop) (REG_MOVE bn r0))
(INST 7 (INT_ADD_IMM p p 8) (BIT_LEFT_ONES_TEST b n))
(INST 0 (LOAD n p)
        (INT_ADD bn bn b)
        (JUMP IF_C c0 t_loop))
```

RAISES

(nothing)

SEE ALSO

BIT\_RIGHT\_

BIT\_LEFT\_

(BIT\_MASK t top bot)

for $i \in [0 \dots 63]$: $t_i \leftarrow ('top \ge i) \oplus (i \ge 'bot) \oplus ('top \ge 'bot)$

{where $'top \in [0 \dots 63]$, $'bot \in [0 \dots 63]$}

A mask is generated that contains 1-bits from bit positions [bot...top] and 0-bits elsewhere. If top is less than bot then the bit positions set to 1 are [0...top] and [bot...63], generating a complement mask.

## **EXAMPLES**

A mask containing ones in bit positions [21...46] and zeros elsewhere is generated by (BIT\_MASK t 46 21). Its complement is generated by (BIT\_MASK t 20 47).
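The mask formula can be checked directly with a short Python model. The function name `bit_mask` is an assumption made for this sketch; the loop body is a literal transcription of the three-term exclusive-or in the pseudo code.

```python
MASK64 = (1 << 64) - 1


def bit_mask(top: int, bot: int) -> int:
    """Build the 64-bit mask bit by bit from the pseudo code for BIT_MASK."""
    if not (0 <= top <= 63 and 0 <= bot <= 63):
        raise ValueError("top and bot must lie in [0...63]")
    t = 0
    for i in range(64):
        bit = (top >= i) ^ (i >= bot) ^ (top >= bot)
        t |= int(bit) << i
    return t


ones_21_to_46 = bit_mask(46, 21)
complement = bit_mask(20, 47)
print(hex(ones_21_to_46), hex(complement))
```

The two masks printed are exact complements of each other, matching the example above.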

**RAISES** 

(nothing)

SEE ALSO

**INT\_IMM** 

Bit Operations

(BIT\_MAT\_OR $t\ u\ v$)

for $i,j \in [0 \dots 7]$: $t_{8*i+j} \leftarrow \bigvee_{k=0}^{7} (u_{8*i+k} \wedge v_{8*k+j})$

(BIT\_MAT\_TRANSPOSE $t\ u$)

for $i,j \in [0 \dots 7]$: $t_{8*i+j} \leftarrow u_{8*j+i}$

(BIT\_MAT\_XOR $t\ u\ v$)

for $i,j \in [0 \dots 7]$: $t_{8*i+j} \leftarrow \bigoplus_{k=0}^{7} (u_{8*i+k} \wedge v_{8*k+j})$

These operations provide the basic support for multiply and transpose of bit matrices. Each byte of a word represents a row of an  $8 \times 8$  matrix. Matrices of arbitrary size can be represented as matrices of  $8 \times 8$  blocks.

### **EXAMPLES**

The word  $8040201008040201_{16}$  is the identity matrix. For either BIT\_MAT\_XOR or BIT\_MAT\_OR the matrix  $0102040810204080_{16}$  in u will reverse the bytes in v, leaving the bit order unchanged. The same matrix in v will reverse the bits in each byte of u, leaving the byte order unchanged.
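The claims in this example can be reproduced with a direct, unoptimized Python model of the pseudo code. The function names are hypothetical, and the triple loop is written for clarity rather than speed.

```python
def bit(w: int, n: int) -> int:
    """Bit number n of word w (bit 0 is the rightmost bit)."""
    return (w >> n) & 1


def bit_mat_mul(u: int, v: int, xor: bool = False) -> int:
    """Model of BIT_MAT_OR (xor=False) or BIT_MAT_XOR (xor=True)."""
    t = 0
    for i in range(8):
        for j in range(8):
            acc = 0
            for k in range(8):
                term = bit(u, 8 * i + k) & bit(v, 8 * k + j)
                acc = (acc ^ term) if xor else (acc | term)
            t |= acc << (8 * i + j)
    return t


def bit_mat_transpose(u: int) -> int:
    """Model of BIT_MAT_TRANSPOSE."""
    t = 0
    for i in range(8):
        for j in range(8):
            t |= bit(u, 8 * j + i) << (8 * i + j)
    return t


IDENTITY = 0x8040201008040201
ANTIDIAGONAL = 0x0102040810204080
sample = 0x0123456789ABCDEF
print(hex(bit_mat_mul(ANTIDIAGONAL, sample)))  # bytes of sample in reverse order
```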

#### **RAISES**

(nothing)

SEE ALSO

BIT\_OR, BIT\_AND, BIT\_XOR

BIT\_MAT\_

(BIT\_MERGE $t\ u\ v\ w$)

for $i \in [0 \dots 63]$: $t_i \leftarrow \text{if } w_i \text{ then } u_i \text{ else } v_i$

(BIT\_MERGE\_TEST $t\ u\ v\ w$)

for $i \in [0 \dots 63]$: $t_i \leftarrow \text{if } w_i \text{ then } u_i \text{ else } v_i$

These operations select a bit from u if the corresponding bit in w is set; otherwise the bit from v is selected.

BIT\_MERGE\_TEST never generates overflow/NaN or carry.

**RAISES** 

(nothing)


(BIT\_NAND $t\ u\ v$)

$$t \leftarrow \neg (u \land v)$$

(BIT\_NAND\_TEST $t\ u\ v$)

$$t \leftarrow \neg (u \land v)$$

These operations compute bitwise negated and.

BIT\_NAND\_TEST never generates overflow/NaN or carry.

RAISES

(nothing)

BIT\_NAND\_

(BIT\_NIMP $t\ u\ v$)

$$t \leftarrow u \wedge \neg v$$

(BIT\_NIMP $x\ y\ z$)

$$x \leftarrow y \wedge \neg z$$

(BIT\_NIMP\_TEST $t\ u\ v$)

$$t \leftarrow u \wedge \neg v$$

(BIT\_NIMP\_TEST $x\ y\ z$)

$$x \leftarrow y \wedge \neg z$$

These operations compute bitwise negated implication.

BIT\_NIMP\_TEST never generates overflow/NaN or carry.

# RAISES

(nothing)

(BIT\_NOR $t\ u\ v$)

$$t \leftarrow \neg (u \lor v)$$

(BIT\_NOR\_TEST $t\ u\ v$)

$$t \leftarrow \neg (u \lor v)$$

These operations compute bitwise negated or.

BIT\_NOR\_TEST never generates overflow/NaN or carry.

**RAISES** 

(nothing)

BIT\_NOR\_

(BIT\_ODD\_AND $t\ u\ v$)

$$t \leftarrow (-1) * \bigoplus_{j=0}^{63} (u_j \wedge v_j)$$

(BIT\_ODD\_AND\_TEST $t\ u\ v$)

$$t \leftarrow (-1) * \bigoplus_{j=0}^{63} (u_j \wedge v_j)$$

(BIT\_ODD\_NIMP $t\ u\ v$)

$$t \leftarrow (-1) * \bigoplus_{j=0}^{63} (u_j \wedge \neg v_j)$$

(BIT\_ODD\_NIMP\_TEST $t\ u\ v$)

$$t \leftarrow (-1) * \bigoplus_{j=0}^{63} (u_j \wedge \neg v_j)$$

(BIT\_ODD\_OR $t\ u\ v$)

$$t \leftarrow (-1) * \bigoplus_{j=0}^{63} (u_j \vee v_j)$$

These operations do a two-operand bitwise operation and compute the parity of the result. The value stored in t is either all 1's or all 0's.
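A short Python model shows how the parity is turned into an all-ones or all-zeros word. The function names are hypothetical, and only the three bitwise operations whose names survive in the listing above are modelled.

```python
MASK64 = (1 << 64) - 1


def parity(x: int) -> int:
    """Exclusive-or of the 64 bits of x: 1 if the number of 1-bits is odd."""
    return bin(x & MASK64).count("1") & 1


def spread(p: int) -> int:
    """(-1) * p as a 64-bit word: all 1's when p is 1, all 0's when p is 0."""
    return MASK64 if p else 0


def bit_odd_and(u: int, v: int) -> int:
    return spread(parity(u & v))


def bit_odd_nimp(u: int, v: int) -> int:
    return spread(parity(u & ~v))


def bit_odd_or(u: int, v: int) -> int:
    return spread(parity(u | v))


print(hex(bit_odd_and(0b1011, 0b0110)), hex(bit_odd_or(0b1011, 0b0110)))
```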

### RAISES

(nothing)

(BIT\_OR $t\ u\ v$)

$$t \leftarrow u \lor v$$

(BIT\_OR $x\ y\ z$)

$$x \leftarrow y \lor z$$

(BIT\_OR\_TEST $t\ u\ v$)

$$t \leftarrow u \lor v$$

These operations compute bitwise or.

BIT\_OR\_TEST never generates overflow/NaN or carry.

**RAISES** 

(nothing)

BIT\_OR\_

These operations are used for packing bit fields from v into t under control of the mask found in u. Bits from v selected by zeros in u are packed consecutively into bit positions in the destination register t, starting from the right. Unfilled positions in t are set to zero.

#### **EXAMPLES**

This example only shows 8 bits, and describes the operations for BIT\_PACK. Capital letters stand for arbitrary bit values.

| BIT_PACK | register       | what                 |
|----------|----------------|----------------------|
| 11011001 | u              | source selector mask |
| RSTVWXYZ | v              | source               |
| 00000TXY | t              | destination          |

Bits T, X and Y are selected by the source mask, and are written into the destination register t in that order, packed to the right, with zeros filled in on the left.
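The packing rule can be sketched in Python as follows. The function name `bit_pack` and the `width` parameter are assumptions made so that the 8-bit example can be run directly; the real operation works on 64-bit words.

```python
def bit_pack(u: int, v: int, width: int = 64) -> int:
    """Pack the bits of v selected by 0-bits of u into t, starting from the right."""
    t = 0
    out = 0  # next free bit position in the destination
    for i in range(width):
        if not (u >> i) & 1:          # a zero in the mask selects source bit i
            t |= ((v >> i) & 1) << out
            out += 1
    return t


# Mask from the table; v = RSTVWXYZ with T=1, X=1, Y=0 (other bits arbitrary).
u = 0b11011001
v = 0b10110101
t = bit_pack(u, v, width=8)
print(f"{t:08b}")
```

With these values the destination holds `00000TXY`, which is `00000110`.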

RAISES

(nothing)

SEE ALSO

BIT\_UNPACK\_1

(BIT\_RIGHT\_ONES $x\ y$)

$$x \leftarrow \min\{i \mid (y \gg i) \text{ is even}\}$$

(BIT\_RIGHT\_ONES\_TEST $x\ y$)

$$x \leftarrow \min\{i \mid (y \gg i) \text{ is even}\}$$

(BIT\_RIGHT\_ZEROS $x\ y$)

$$x \leftarrow 64 - \min\{i \mid (y \ll i) = 0\}$$

(BIT\_RIGHT\_ZEROS\_TEST $x\ y$)

$$x \leftarrow 64 - \min\{i \mid (y \ll i) = 0\}$$

These operations respectively return the number of consecutive 1- or 0-bits on the right end of the word in y.

The \_TEST versions of these operations generate carry when the result is 64 and never generate overflow/NaN.

RAISES

(nothing)

SEE ALSO

BIT\_LEFT\_

BIT\_RIGHT\_

(BIT\_TALLY $t\ u$)

$$t \leftarrow \sum_{j=0}^{63} u_j$$

(BIT\_TALLY\_TEST $t\ u$)

$$t \leftarrow \sum_{j=0}^{63} u_j$$

These operations count the number of 1-bits in the u register value. The TEST versions never generate overflow/NaN and generate carry when t=1.

RAISES

(nothing)


```
(BIT_UNPACK_1 t u v)
     for i \in [0...63]
           shift_i \leftarrow i + 1 - tally (\neg [u_i \dots u_0])
           s_i \leftarrow shift_i \wedge 63
           d_i \leftarrow shift_i \wedge 15
           t_{i-d_i} \leftarrow v_{i-s_i}
(BIT_UNPACK_2 t u v)
     for i \in [0...63]
           shift_i \leftarrow i + 1 - tally (\neg [u_i \dots u_0])
           s_i \leftarrow shift_i \wedge 15
           d_i \leftarrow shift_i \wedge 3
           t_{i-d_i} \leftarrow v_{i-s_i}
(BIT_UNPACK_3 t u v)
     for i \in [0...63]
           shift_i \leftarrow i + 1 - tally (\neg [u_i \dots u_0])
           s_i \leftarrow shift_i \wedge 3
           d_i \leftarrow shift_i \wedge 0
           t_{i-d_i} \leftarrow \neg u_i \wedge v_{i-s_i}
```

The BIT\_UNPACK\_1, BIT\_UNPACK\_2, BIT\_UNPACK\_3 operation sequence is used for unpacking bit data in v into result t under control of the mask found in u.

Using a fixed mask and the result of BIT\_UNPACK\_1 as the input data for BIT\_UNPACK\_2 (and BIT\_UNPACK\_2 for BIT\_UNPACK\_3), bits from v are packed consecutively into bit positions in the destination register t selected by zeros in u. Unselected positions in t are set to zero. Extra bits from v are discarded. The operation packs from right to left.

### **EXAMPLES**

This example only shows 8 bits, and describes the operations for the three-operation bit unpack sequence. Capital letters stand for arbitrary bit values.

| bit unpack | register         | what                      |
|------------|------------------|---------------------------|
| 01001110   | u                | destination selector mask |
| RSTVWXYZ   | v                | source                    |
| W0XY000Z   | t                | destination               |

Bits W, X, Y and Z are written into the destination register t in that order, but in positions selected by the mask in u.
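The net effect of the three-operation sequence can be sketched in Python. This models only the final result described in the text, not the individual BIT\_UNPACK\_1, BIT\_UNPACK\_2 and BIT\_UNPACK\_3 steps; the function name and the `width` parameter are assumptions made so that the 8-bit example can be run.

```python
def bit_unpack(u: int, v: int, width: int = 64) -> int:
    """Scatter the low-order bits of v into the positions where u has 0-bits."""
    t = 0
    src = 0  # next source bit of v to consume, counted from the right
    for i in range(width):
        if not (u >> i) & 1:          # a zero in the mask selects destination bit i
            t |= ((v >> src) & 1) << i
            src += 1
    return t


# Mask from the table; v = RSTVWXYZ with W=0, X=1, Y=0, Z=1.
u = 0b01001110
v = 0b10110101
t = bit_unpack(u, v, width=8)
print(f"{t:08b}")
```

With these values the destination holds `W0XY000Z`, which is `00100001`; the extra source bits R, S, T and V are discarded.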

RAISES

(nothing)

SEE ALSO

BIT\_PACK

BIT\_UNPACK\_

(BIT\_XNOR $t\ u\ v$)

$$t \leftarrow \neg (u \oplus v)$$

(BIT\_XNOR\_TEST $t\ u\ v$)

$$t \leftarrow \neg (u \oplus v)$$

These operations compute bitwise negated exclusive-or.

BIT\_XNOR\_TEST never generates overflow/NaN or carry.

**RAISES** 

(nothing)

(BIT\_XOR $t\ u\ v$)

$$t \leftarrow u \oplus v$$

(BIT\_XOR $x\ y\ z$)

$$x \leftarrow y \oplus z$$

(BIT\_XOR\_TEST $t\ u\ v$)

$$t \leftarrow u \oplus v$$

These operations compute bitwise exclusive-or.

BIT\_XOR\_TEST never generates overflow/NaN or carry.

**RAISES** 

(nothing)

BIT\_XOR\_

Raise privileged operation exception

This operation raises a privileged operation exception. It may be used by the debugger to implement breakpoints.

# **RAISES**

privileged

(CLOCK $z\ y$)

$$z \leftarrow \text{clock} - y$$

This operation returns the contents of the 64-bit clock register, which increments by one on each clock tick. Normally, the clock register contents are synchronized across all processors in a system.

RAISES

(nothing)

SEE ALSO

§10.3

CLOCK

(COUNT\_CONCURRENCY $t$)

$$t \leftarrow (\text{concurrency counter})_D$$

(COUNT\_EVENTS $t$ $ec$)

$$t \leftarrow (\text{event counter at } ec)_D$$

(COUNT\_ISSUES $t$)

$$t \leftarrow (\text{instruction issue counter})_D$$

(COUNT\_MEMREFS $t$)

$$t \leftarrow (\text{memory reference counter})_D$$

(COUNT\_STREAMS $t$)

$$t \leftarrow (\text{stream counter})_D$$

These operations read one of the eight counters in the protection domain D of the executing stream. Each of the four event counters can be set independently to one of the count sources by the supervisor-privileged COUNT\_SELECT\_RESTORE operation. Each event counter has an eight-bit CountSource tag that determines what is to be counted. These tags are packed in the count\_select register. The counters and the encoding of the CountSource tag are described in §10.2.

RAISES

(nothing)

SEE ALSO

COUNT\_SELECT\_RESTORE, CLOCK, STREAM\_COUNT\_INST

(COUNT\_SELECT\_RESTORE $u$)

$$(\text{event counter select})_D \leftarrow u$$

$$t \leftarrow (\text{event counter select})_D$$

These operations allow the event counter select register to be read and written. The COUNT\_SELECT\_RESTORE operation requires supervisor privilege. This register contains the four select tags for the programmable event counters, described in §10.2.

RAISES

privileged

SEE ALSO

COUNT\_

COUNT\_SELECT\_

(COUNT\_PHANTOMS $t$)

$$t \leftarrow (\text{processor phantom counter})$$

$$t \leftarrow (\text{processor ready counter})$$

These operations read the processor's phantom and ready counters. The phantom counter sums the number of ticks where no stream was ready, so that the processor utilization may be measured. The ready counter sums the number of ready but not issuing streams at each tick, so that the average ready pool size and average waiting time may be measured.
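One way these counters might be combined with the clock is sketched below. Everything here is illustrative: the `Sample` record, the function names, and the sample values are hypothetical, and the utilization formula assumes one issue slot per tick.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical snapshot of the three per-processor counters."""
    clock: int
    phantoms: int
    ready: int


def utilization(start: Sample, end: Sample) -> float:
    """Fraction of ticks in the interval that were not counted as phantoms."""
    ticks = end.clock - start.clock
    return 1.0 - (end.phantoms - start.phantoms) / ticks


def mean_ready_pool(start: Sample, end: Sample) -> float:
    """Average number of ready but not issuing streams per tick."""
    ticks = end.clock - start.clock
    return (end.ready - start.ready) / ticks


before = Sample(clock=1000, phantoms=200, ready=5000)   # invented readings
after = Sample(clock=2000, phantoms=450, ready=9000)    # invented readings
print(utilization(before, after), mean_ready_pool(before, after))
```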

RAISES

(nothing)

SEE ALSO

CLOCK

flush (data map at s) from data map cache

flush any (data map at s) from data map cache

These are supervisor-privileged operations to maintain consistency in the data address translation cache.

The domain in Bits 63-60 and segment in Bits 41-28 of s address the data map entry; other bits of s are ignored. A violation of the map limit will not raise a data map limit exception. The DATA\_MAP\_FLUSH operation is used to flush a single map entry, as after changing the data map in data memory. The DATA\_MAP\_FLUSH\_ANY operation is used to flush any map entry for the specified domain from the cache. Since the cache is not fully associative, up to 512 flushes may be required—each flush should specify a different segment modulo 512. See §6.2.
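The operand values for a complete DATA\_MAP\_FLUSH\_ANY sweep could be generated as in the sketch below. The function name is hypothetical; the code only builds the 512 values of s, using the bit positions given above, and does not issue any operation.

```python
SEGMENT_BITS = 14      # Bits 41-28 of s
SEGMENT_SHIFT = 28
DOMAIN_SHIFT = 60      # Bits 63-60 of s


def flush_any_operands(domain: int, first_segment: int = 0) -> list[int]:
    """Return 512 values of s whose segments are all distinct modulo 512."""
    if not 0 <= domain < 16:
        raise ValueError("domain must fit in four bits")
    seg_mask = (1 << SEGMENT_BITS) - 1
    return [
        (domain << DOMAIN_SHIFT) | (((first_segment + k) & seg_mask) << SEGMENT_SHIFT)
        for k in range(512)
    ]


operands = flush_any_operands(domain=3)
print(len(operands), hex(operands[1]))
```

Because the segments are consecutive, each one falls in a different residue class modulo 512, which is the condition stated in the text.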

These operations are subject to the min\_dkill level (see §8.4) based on the domain. Therefore, it may be necessary to perform a DATA\_STATE\_RESTORE on the specified domain prior to flushing.

RAISES

data\_prot, privileged
SEE ALSO
PROGRAM\_MAP\_

DATA\_MAP\_

(DATA\_OPA\_SAVE $r$ opno)

$$r \leftarrow \text{address state of operation } opno$$

(DATA\_OPD\_SAVE $r$ opno)

$$r \leftarrow \text{data state of operation } opno$$

(DATA\_OP\_REDO $r\ s$)

perform memory operation in s with data to or from r

These operations save and (re)execute failed memory references. They are normally used by trap handlers. A stream may have eight memory references pending in the M-unit. Each reference is described by two words. One word contains the data value (if any), and the other contains address and control information. The data value must be read after the address word. The eight data result fields of the result code register describe which of these memory references are exceptional after a trap.

DATA\_OPA\_SAVE retrieves a Data Control Descriptor containing address and control information from the M-unit for the operation denoted by opno and places that information into register r (see  $\S6.3$ ). To allow an opno of zero to select the descriptor which must be saved first, the lookahead index is added to opno modulo eight. DATA\_OPD\_SAVE retrieves the corresponding data from the M-unit for the operation denoted by opno and places that information into register r. Note that DATA\_OPA\_SAVE must be performed before DATA\_OPD\_SAVE for each opno.

The DATA\_OP\_REDO operation re-executes an M-unit operation, given a Data Control Descriptor in register s and the corresponding data in register r. The "original" register number in field "dest\_reg" of s is ignored; a load operation will place its result in register r. DATA\_OP\_REDO raises data memory exceptions (and otherwise behaves) just as the "original" operation, as described by r and s, would have. If the value in s does not indicate one of the defined operations for the Data Control Descriptor, a data\_prot exception is raised, and the result code is set to DR\_UNIMPLEMENTED\_OP.

Since the value of s must be interpreted before it is known whether the DATA\_OP\_REDO operation will actually read or write r, poison is checked for reading r.

#### RAISES

data\_prot, data\_alignment, data\_blocked

SEE ALSO

EXCEPTION, RESULTCODE, §6.3

$$\underset{64}{\dots}\underset{61}{0}\underset{56}{s}\underset{51}{F}\underset{47}{\dots}\underset{21}{00}\underset{16}{00}\underset{11}{04}\underset{6}{02}\underset{0}{0} \quad MC$$

(data state descriptor for domain in s) ← s

This supervisor-privileged operation is used to set the data state descriptor.

The value s is the data state descriptor; see §8.4. Bits 63-60 of s specify the protection domain, i.e. the data state descriptor is self-tagged.

RAISES

privileged
SEE ALSO
PROGRAM\_STATE

DATA\_STATE\_

```
(DOMAIN_ENTER)
     if limbo > 0 then
        limbo \leftarrow limbo - 1;
        SRES_D \leftarrow SRES_D + 1;
        SCUR_D \leftarrow SCUR_D + 1;
     else
        raise create exception
     end
(DOMAIN_LEAVE u)
     limbo \leftarrow limbo + 1;
     SCUR_D \leftarrow SCUR_D - 1;
     SRES_D \leftarrow SRES_D - 1;
     if limbo >= 128 then
        raise create exception
     end
```

These are supervisor-privileged operations to change protection domains. No lookahead is allowed across a DOMAIN\_LEAVE/DOMAIN\_ENTER pair.

Executing a DOMAIN\_LEAVE, DOMAIN\_ENTER sequence will change the domain to the specified value. The protection domain D of the stream changes to that specified in Bits 3-0 of register u. The current pc address in the SSW must map to the same page in both the new and old domains.

The create exception protects against a DOMAIN\_ENTER without a matching DOMAIN\_LEAVE, or too many domain changes in progress at once.

RAISES

privileged, create

SEE ALSO

LEVEL\_RTN

# (DOMAIN\_IDENTIFIER\_SAVE t)

t ← D, the current protection domain identifier

This is a supervisor-privileged operation used to retrieve the identity of the current protection domain D.

**RAISES** 

privileged

DOMAIN\_ID\_

These operations manipulate the exception register, which contains the exception bits that record unusual and possibly significant events resulting from instruction execution. Every exception bit causes a trap unless that trap is disabled by the appropriate bit in the trap mask of the ssw.

Additional information about the most recent floating-point and memory exceptions is found in the result code register.

RAISES

(nothing)

SEE ALSO

RESULTCODE, §9.1

Float Operations

(FLOAT\_CEIL $t\ u$)

$$t \leftarrow (\text{float ceiling of float } u) * 2^{-1074}$$

(FLOAT\_CHOP $t\ u$)

$$t \leftarrow (\text{float integer chop of float } u) * 2^{-1074}$$

(FLOAT\_FLOOR $t\ u$)

$$t \leftarrow (\text{float floor of float } u) * 2^{-1074}$$

(FLOAT\_NEAR $t\ u$)

$$t \leftarrow (\text{float integer nearest float } u) * 2^{-1074}$$

(FLOAT\_ROUND $t\ u$)

$$t \leftarrow (\text{float integer round of float } u) * 2^{-1074}$$

These operations scale 64-bit floating-point numbers by  $2^{-1074}$ , then round the result into floating-point numbers. Since the scaling reduces 1.0 to the minimum denormalized number, the effect is to round the argument to an integer. The roundings are directed as in IEEE Standard 754. FLOAT\_ROUND uses the rounding mode in the ssw.

Note that the float\_inexact exception is never raised, since it is expected for these operations.
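The scaling idea can be observed on any machine with IEEE 754 double precision, as in the Python sketch below. It is an illustration only: the function name is hypothetical, it assumes the host multiplies in round-to-nearest-even mode with denormals enabled, and it scales the result back up with a division so that the integer can be printed.

```python
import math

MIN_DENORMAL = math.ldexp(1.0, -1074)   # 2**-1074, the smallest positive double


def round_nearest_via_scaling(x: float) -> float:
    """Round x to an integer by scaling it down to the denormal range.

    After multiplying by 2**-1074, the only representable values are whole
    multiples of the minimum denormal, so the multiplication itself performs
    the rounding (round to nearest even on a default IEEE 754 host).
    """
    scaled = x * MIN_DENORMAL
    return scaled / MIN_DENORMAL        # exact: undoes the power-of-two scaling


for value in (2.5, 3.5, 0.4, -1.5, 7.0):
    print(value, round_nearest_via_scaling(value))
```

The ties at 2.5, 3.5 and -1.5 go to the even neighbour, which is the behaviour of the nearest rounding direction in IEEE Standard 754.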

RAISES
(nothing)
SEE ALSO
INT\_UNS\_

FLOAT\_

(FLOAT\_ADD $t\ u\ v$)

$$t \leftarrow u + v$$, floating point

(FLOAT\_ADD $x\ y\ z$)

$$x \leftarrow y + z$$, floating point

This operation computes the floating-point sum of two numbers.

**RAISES** 

float\_invalid, float\_overflow, float\_inexact

COUNTS AS

CNT\_FLOAT\_ADD, CNT\_FLOAT\_TOTAL

SEE ALSO

FLOAT\_ADD\_MUL


$$t_{47} v_{42} v_{37} v_{32} v_{27} v_{21}$$
 A

t ← u + v \* w, floating point

This operation does a floating-point multiply followed by a floating-point add, counting as two floating-point operations. If u is register 0, only a floating-point multiply is performed (to preserve the sign of a zero result).

Only one rounding operation is performed, enhancing accuracy and greatly facilitating doubled precision operations.
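The effect of a single rounding can be seen by comparing it with the two-rounding sequence on a host machine. This is a hedged sketch for finite inputs only: it emulates the single rounding with exact rational arithmetic rather than modelling the hardware, and the function names are hypothetical. The register 0 special case is not modelled.

```python
from fractions import Fraction


def add_mul_single_rounding(u: float, v: float, w: float) -> float:
    """u + v * w computed exactly, then rounded once (finite inputs only)."""
    exact = Fraction(u) + Fraction(v) * Fraction(w)
    return float(exact)  # one correctly rounded conversion to Float64


def add_mul_two_roundings(u: float, v: float, w: float) -> float:
    """The product is rounded first, then the sum is rounded again."""
    return u + v * w


v = 1.0 + 2.0 ** -30
w = 1.0 - 2.0 ** -30
single = add_mul_single_rounding(-1.0, v, w)  # keeps the low-order term
double = add_mul_two_roundings(-1.0, v, w)    # low-order term is lost
```

Here the exact product is $1 - 2^{-60}$, which rounds to 1.0 as a Float64, so the two-rounding result is zero while the single-rounding result retains $-2^{-60}$.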

#### **RAISES**

float\_invalid, float\_overflow, float\_inexact, float\_underflow

COUNTS AS

CNT\_FLOAT\_MUL, CNT\_FLOAT\_ADD if $u \neq 0$, CNT\_FLOAT\_TOTAL

SEE ALSO

FLOAT\_ADD, FLOAT\_SUB\_MUL, FLOAT\_SUB\_MUL\_REV, INT\_ADD\_MUL, §12.4

FLOAT\_ADD\_MUL\_

```
(FLOAT_APPROX_RESTORE y)
      index \leftarrow y_{51...45}
      for i \in [0...7]:
           rbase_i \leftarrow y_{2*i}
           qbase_i \leftarrow y_{2*i+1}
      for i \in [8...15]:
           rbase_i \leftarrow y_{2*i+2}
           qbase_i \leftarrow y_{2*i+3}
      rbase_{16} \leftarrow y_{36}
      qbase_{16} \leftarrow y_{37}
      for i \in [0...2]:
           rslope_i \leftarrow y_{2*i+38}
           qslope_i \leftarrow y_{2*i+39}
      for i \in [3...6]:
           rslope_i \leftarrow y_{2*i+46}
           qslope_i \leftarrow y_{2*i+47}
      recip_base_{index} \leftarrow rbase
      rsqrt_base_{index} \leftarrow qbase
      recip_slope_{index} \leftarrow rslope
      rsqrt_slope_{index} \leftarrow qslope
         {where y_{16} = parity(rbase_{0..7}), y_{17} = parity(qbase_{0..7}),
          y_{34} = parity(rbase_{8..15}), y_{35} = parity(qbase_{8..15}),
          y_{60} = parity(rbase_{16}|rslope), y_{61} = parity(qbase_{16}|qslope)}
```

This IPL-privileged operation is used to initialize the floating-point reciprocal and reciprocal square root approximation tables.

```
RAISES
(nothing)
SEE ALSO
FLOAT_RECIP_APPROX, FLOAT_RSQRT_APPROX
```

(FLOAT\_CMP\_TEST $t \ u \ v$)

$t \leftarrow u - v$, floating point

(FLOAT\_CMP\_TEST $x \ y \ z$)

$x \leftarrow y - z$, floating point

This operation performs a floating-point comparison.

FLOAT\_CMP\_TEST generates overflow/NaN and raises float\_invalid if either operand is NaN. The condition code is zero if the two operands are equal, negative if the first ($u$ or $y$) is less than the second ($v$ or $z$), and positive if the first is greater than the second. Carry is set if and only if the second operand ($v$ or $z$) is NaN.
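The condition-code rules above can be summarized in a small model. This is a sketch under stated assumptions: `CmpResult` and `float_cmp_test` are illustrative names, the sign is reported as -1, 0, or +1 rather than as hardware condition bits, and the sign reported when an operand is NaN is a placeholder because the text does not specify it.

```python
import math
from typing import NamedTuple


class CmpResult(NamedTuple):
    sign: int      # -1, 0, or +1 (placeholder 0 when an operand is NaN)
    carry: bool    # set iff the second operand is NaN
    nan: bool      # overflow/NaN condition
    invalid: bool  # float_invalid raised


def float_cmp_test(u: float, v: float) -> CmpResult:
    """Model of the comparison outcome for first operand u and second operand v."""
    carry = math.isnan(v)
    if math.isnan(u) or carry:
        return CmpResult(0, carry, True, True)
    sign = (u > v) - (u < v)
    return CmpResult(sign, carry, False, False)
```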

RAISES
float\_invalid
COUNTS AS
CNT\_FLOAT\_ADD, CNT\_FLOAT\_TOTAL
SEE ALSO
FLOAT\_MAX\_TEST, FLOAT\_MIN\_TEST, §5.1


(FLOAT\_DIV $t \ u \ v \ w$)

$t \leftarrow u + v * w$, floating point

This operation is used to complete the floating-point division of u by v. The reciprocal in w is a SpecialFloat64 as delivered by FLOAT\_ITER. The inexact exception arising from the division of two floating-point numbers will be raised by this operation.

#### RAISES

float\_overflow, float\_underflow, float\_inexact
COUNTS AS
CNT\_FLOAT\_DIV, CNT\_FLOAT\_TOTAL
SEE ALSO
FLOAT\_DIV\_APPROX, §12.5

(FLOAT\_DIV\_APPROX $t \ u \ v \ w$)

$exp \leftarrow$ unbiased exponent of $u$;

$t \leftarrow v * w/2^{exp-52}$, floating point, round to nearest

(FLOAT\_SQRT\_APPROX\_TEST $t \ u \ v \ w$)

$exp \leftarrow$ unbiased exponent of $u$;

$t \leftarrow v * w/2^{\lfloor exp/2 \rfloor}$, floating point, round to nearest

These operations perform a floating-point multiply of SpecialFloat64 w with Float64 v, with round to nearest. They are used for floating-point division and square root computation. A float\_invalid exception is raised for 0/0, infinity/infinity, and square root of negative. A float\_inexact exception is raised only in conjunction with float\_overflow. The condition code produced by FLOAT\_SQRT\_APPROX\_TEST is undefined.

#### **RAISES**

float\_invalid, float\_zero\_divide, float\_overflow, float\_inexact
COUNTS AS
CNT\_FLOAT\_TOTAL
SEE ALSO
FLOAT\_DIV, FLOAT\_SQRT, FLOAT\_ITER, §12.5


```
(FLOAT_DIV_ERROR t u v w)

exp \leftarrow unbiased exponent of v;

t \leftarrow (u - v * w)/2^{exp-52}, floating point, round to nearest

(FLOAT_SQRT_ERROR_TEST t u v w)

exp \leftarrow unbiased exponent of u;

t \leftarrow 0.5 * (u - v * w)/2^{\lfloor exp/2 \rfloor}, floating point, round to nearest
```

These operations subtract the product of v and w from u in floating point, using round to nearest. The result is scaled to avoid undesirable overflow or underflow. They are used for floating-point division and square root computation.

FLOAT\_DIV\_ERROR returns a zero with the sign of u when u is infinity or NaN, or v is infinity, NaN, or zero, or w is infinity or NaN. When the rounded result would be zero and inexact, it is rounded away from zero to produce the minimum denorm floating-point number. FLOAT\_SQRT\_ERROR\_TEST returns a positive zero when v is infinity, NaN, or zero. The condition code produced is undefined.

RAISES
(nothing)
COUNTS AS
CNT\_FLOAT\_TOTAL
SEE ALSO
FLOAT\_DIV, FLOAT\_SQRT, FLOAT\_ITER, §12.5

(FLOAT\_INT $t \ u$)

$t \leftarrow$ float of integer $u$

This operation converts an integer into a floating-point number, rounding according to the current rounding mode in the ssw.
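A conversion is inexact whenever the integer needs more than 53 significant bits. The sketch below illustrates this on a host machine; it assumes round to nearest (the only mode Python's `float()` provides), so it does not model the other rounding modes selectable in the ssw, and `float_int` is an illustrative name.

```python
def float_int(u: int) -> tuple[float, bool]:
    """Return (converted value, inexact flag) for a signed 64-bit integer u."""
    t = float(u)      # rounds to nearest, ties to even
    inexact = t != u  # Python compares int and float exactly
    return t, inexact
```

For example, $2^{53} + 1$ has no Float64 representation and converts to $2^{53}$ with the inexact flag set.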

RAISES
float\_inexact
SEE ALSO
FLOAT\_UNS


(FLOAT\_ITER $t \ u \ v \ w$)

$t \leftarrow u + v * w$, floating point, round to nearest

In use, u and w are an extended precision (SpecialFloat64) reciprocal whose accuracy is increased by cancelling out the relative error given by the floating-point number v. The result is stored as a SpecialFloat64. FLOAT\_ITER is used in both floating-point and integer division and square root computations.

RAISES

(nothing)

SEE ALSO

FLOAT\_RECIP\_APPROX, FLOAT\_DIV\_APPROX, §12.5

(FLOAT\_MAX $t \ u \ v$)

$t \leftarrow \max(u, v)$, floating point

(FLOAT\_MAX\_TEST $t \ u \ v$)

$t \leftarrow \max(u, v)$, floating point

These operations select the larger of the two floating-point operands. If both operands are NaN, u is selected; if only one is NaN, the other (non-NaN) operand is selected. If both operands are zero, a positive zero is selected if one is present. See §5.1.

FLOAT\_MAX\_TEST generates overflow/NaN if either operand is NaN. The condition code is zero if the two operands are equal, negative if the first (u) is less than the second (v), and positive if the first is greater than the second. Carry is set if and only if v is NaN.
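The selection rules for NaN and signed zero differ from a naive comparison, so a small model may help. This is a sketch of the value selection only, with `float_max` as an illustrative name; condition codes are not modelled.

```python
import math


def float_max(u: float, v: float) -> float:
    """Select the larger operand using the NaN and signed-zero rules above."""
    if math.isnan(u) and math.isnan(v):
        return u                      # both NaN: u is selected
    if math.isnan(u):
        return v                      # only u is NaN: the non-NaN operand
    if math.isnan(v):
        return u                      # only v is NaN: the non-NaN operand
    if u == 0.0 and v == 0.0:
        # Both zero: prefer a positive zero if one is present.
        return u if math.copysign(1.0, u) > 0 else v
    return u if u >= v else v
```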

RAISES

(nothing)

COUNTS AS

CNT\_FLOAT\_TOTAL

SEE ALSO

FLOAT\_MIN, SELECT\_FLOAT, INT\_MAX


(FLOAT\_MIN $t \ u \ v$)

$t \leftarrow \min(u, v)$, floating point

(FLOAT\_MIN\_TEST $t \ u \ v$)

$t \leftarrow \min(u, v)$, floating point

These operations select the smaller of the two floating-point operands. If both operands are NaN, u is selected; if only one is NaN, the other (non-NaN) operand is selected. See §5.1.

FLOAT\_MIN\_TEST generates overflow/NaN if either operand is NaN. The condition code is zero if the two operands are equal, negative if the first (u) is less than the second (v), and positive if the first is greater than the second. Carry is set if and only if v is NaN. If both operands are zero, a negative zero is selected if one is present.

RAISES
(nothing)
COUNTS AS
CNT\_FLOAT\_TOTAL
SEE ALSO
FLOAT\_MAX, SELECT\_FLOAT, INT\_MIN

(FLOAT\_MMAX $t \ u \ v$)

$t \leftarrow \text{if } abs(u) \ge abs(v) \text{ then } u \text{ else } v \text{ end}$, floating point

(FLOAT\_MMAX\_TEST $t \ u \ v$)

$t \leftarrow \text{if } abs(u) \ge abs(v) \text{ then } u \text{ else } v \text{ end}$, floating point

These operations select the larger in magnitude of the two floating-point operands. If both operands are NaN, u is selected; if only one is NaN, the other (non-NaN) operand is selected. If the operands have equal magnitude, u is selected.

FLOAT\_MMAX\_TEST generates overflow/NaN if either operand is NaN. The condition code is zero if the two operands are equal in magnitude, negative if the first (u) is smaller than the second (v), and positive if the first is larger than the second. Carry is set if and only if v is NaN.

RAISES

(nothing)

COUNTS AS

CNT\_FLOAT\_TOTAL

SEE ALSO

FLOAT\_MMIN, SELECT\_FLOAT, FLOAT\_MAX


(FLOAT\_MMIN $t \ u \ v$)

$t \leftarrow \text{if } abs(u) < abs(v) \text{ then } u \text{ else } v \text{ end}$, floating point

(FLOAT\_MMIN\_TEST $t \ u \ v$)

$t \leftarrow \text{if } abs(u) < abs(v) \text{ then } u \text{ else } v \text{ end}$, floating point

These operations select the smaller in magnitude of the two floating-point operands. If both operands are NaN, v is selected; if only one is NaN, the other (non-NaN) operand is selected. If the operands have equal magnitude, v is selected.

FLOAT\_MMIN\_TEST generates overflow/NaN if either operand is NaN. The condition code is zero if the two operands are equal in magnitude, negative if the first (u) is smaller than the second (v), and positive if the first is larger than the second. Carry is set if and only if v is NaN.

RAISES

(nothing)

COUNTS AS

CNT\_FLOAT\_TOTAL

SEE ALSO

FLOAT\_MMAX, SELECT\_FLOAT, FLOAT\_MIN

This operation performs a floating-point multiply followed by a floating-point subtraction. It differs from FLOAT\_SUB\_MUL\_REV in the response to exceptions. In particular, when u, v, or w is infinity or NaN, the result is zero with no exception. Normally, an infinity or NaN would be returned and an exception possibly raised. For example, if v \* w rounds to infinity and u is that infinity, 0 rather than NaN is generated in t and no exception is raised.

Only one rounding operation is performed, enhancing accuracy and greatly facilitating doubled precision operations.

RAISES

float\_underflow, float\_overflow, float\_inexact

COUNTS AS

CNT\_FLOAT\_MUL, CNT\_FLOAT\_ADD if $u \neq 0$, CNT\_FLOAT\_TOTAL

SEE ALSO

§12.4

FLOAT\_MUL\_LOWER\_

(FLOAT\_RECIP\_APPROX $x \ y$)

$exp \leftarrow$ unbiased exponent of $y$;

$x \leftarrow$ an approximation to $2^{exp-52}/y$, floating point

(FLOAT\_RECIP\_APPROX\_TEST $x \ y$)

$exp \leftarrow$ unbiased exponent of $y$;

$x \leftarrow$ an approximation to $2^{exp-52}/y$, floating point

(FLOAT\_RSQRT\_APPROX $x \ y$)

$exp \leftarrow$ unbiased exponent of $y$;

$x \leftarrow$ an approximation to $2^{\lfloor exp/2 \rfloor}/\sqrt{y}$, floating point

(FLOAT\_RSQRT\_APPROX\_TEST $x \ y$)

$exp \leftarrow$ unbiased exponent of $y$;

$x \leftarrow$ an approximation to $2^{\lfloor exp/2 \rfloor}/\sqrt{y}$, floating point

These operations are used for computing floating-point reciprocals and reciprocal square roots. They perform a table lookup operation followed by a linear interpolation using an adder-multiplier. The table is in an internal format. The approximation is returned as a SpecialFloat64.

In FLOAT\_\*\_APPROX, if the y operand is denormalized, the float\_extension exception is raised and y is returned. In FLOAT\_\*\_APPROX\_TEST, a denormalized y operand is returned, carry is set, and no exception is raised. In either case, if the y operand is zero, y is returned.

RAISES
float\_extension
SEE ALSO
§12.5


```
(FLOAT_RECIP_ERROR t v w)
exp \leftarrow unbiased exponent of v;
t \leftarrow 1.0 - v * w/2^{exp-52}, floating point, round to nearest
(FLOAT_RSQRT_ERROR_TEST t u v w)
exp \leftarrow unbiased exponent of u;
t \leftarrow 0.5 * (1.0 - v * w/2^{\lfloor exp/2 \rfloor}), floating point, round to nearest
```

These operations are used for computing floating-point reciprocals and square roots. They perform a partial Newton's method iteration using the adder-multiplier, returning the relative error in the SpecialFloat64 reciprocal or reciprocal square root in w compared to the Float64 divisor or square root estimate in v. The condition code produced by FLOAT\_RSQRT\_ERROR\_TEST is undefined.
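The interplay between the error computation and the FLOAT\_ITER update ($t \leftarrow u + v * w$, with u and w both holding the current reciprocal and v the relative error) can be sketched in ordinary double precision. This is an illustration of the Newton step only: it omits the $2^{exp-52}$ scaling and the SpecialFloat64 format, and all function names are hypothetical.

```python
def recip_error(v: float, w: float) -> float:
    """Relative error of w as an approximation to 1/v (scaling omitted)."""
    return 1.0 - v * w


def recip_iter(w: float, err: float) -> float:
    """FLOAT_ITER-style update: w + err * w cancels the first-order error."""
    return w + err * w


def refine_reciprocal(v: float, w0: float, steps: int) -> float:
    """Apply `steps` Newton iterations starting from the estimate w0."""
    w = w0
    for _ in range(steps):
        w = recip_iter(w, recip_error(v, w))
    return w
```

Each step squares the relative error, so an estimate of 0.3 for 1/3 (error 0.1) reaches Float64 accuracy in about four iterations.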

RAISES
(nothing)
COUNTS AS
CNT\_FLOAT\_TOTAL
SEE ALSO
INT\_RECIP\_ERROR, §12.5

$t \leftarrow v * 2^w$, floating point

This operation is used to multiply the floating-point number v by a power of two selected by the signed integer in the low 13 bits of w.
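As a sketch of the operand handling, the following model extracts the signed 13-bit exponent from w and applies it with the host's `ldexp`. The name `float_scalb` is illustrative, and mapping an overflowing result to a signed infinity is an assumption about the untrapped result, not something stated above.

```python
import math


def float_scalb(v: float, w: int) -> float:
    """Multiply v by 2**n, where n is the signed integer in the low 13 bits of w."""
    low = w & 0x1FFF                               # keep the low 13 bits
    n = low - 0x2000 if low & 0x1000 else low      # sign-extend from bit 12
    try:
        return math.ldexp(v, n)
    except OverflowError:
        # Assumption: an overflowing result is modelled as a signed infinity.
        return math.copysign(math.inf, v)
```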

RAISES

float\_overflow, float\_underflow, float\_inexact

COUNTS AS

CNT\_FLOAT\_TOTAL

SEE ALSO

INT\_LOGB

FLOAT\_SCALB

(FLOAT\_SQRT $t \ u \ v \ w$)

$t \leftarrow u + v * w$, floating point

This operation is used to complete the floating-point square root of the floating-point number in u, using the SpecialFloat64 reciprocal in w. The inexact exception arising from computing the square root of a floating-point number is raised by this operation.

RAISES

float\_inexact

COUNTS AS

CNT\_FLOAT\_SQRT, CNT\_FLOAT\_TOTAL

SEE ALSO

FLOAT\_RSQRT\_APPROX, §12.5

(FLOAT\_SUB $t \ u \ v$)

$t \leftarrow u - v$, floating point

(FLOAT\_SUB $x \ y \ z$)

$x \leftarrow y - z$, floating point

This operation computes the floating-point difference of two numbers.

**RAISES** 

float\_invalid, float\_overflow, float\_inexact

COUNTS AS

CNT\_FLOAT\_ADD, CNT\_FLOAT\_TOTAL

SEE ALSO

FLOAT\_SUB\_MUL, FLOAT\_SUB\_MUL\_REV


(FLOAT\_SUB\_MUL $t \ u \ v \ w$)

$t \leftarrow u - v * w$, floating point

(FLOAT\_SUB\_MUL\_REV $t \ u \ v \ w$)

$t \leftarrow v * w - u$, floating point

These operations perform a floating-point multiply followed by a floating-point subtraction. If u is register 0, only a floating-point multiply is performed (to preserve the sign of a zero result).

RAISES

float\_invalid, float\_overflow, float\_inexact, float\_underflow

COUNTS AS

CNT\_FLOAT\_MUL, CNT\_FLOAT\_ADD if $u \neq 0$, CNT\_FLOAT\_TOTAL

SEE ALSO

FLOAT\_ADD\_MUL, FLOAT\_SUB

(FLOAT\_UNS $t \ u$)

$t \leftarrow$ float of unsigned integer $u$

This operation converts unsigned integers into floating-point numbers, rounding according to the current rounding mode in the ssw.

RAISES
float\_inexact
SEE ALSO
FLOAT\_INT

FLOAT\_UNS\_

(INT\_CEIL $t \ u$)

$t \leftarrow$ ceiling of float $u$

(INT\_CEIL\_TEST $t \ u$)

$t \leftarrow$ ceiling of float $u$

(INT\_CHOP $t \ u$)

$t \leftarrow$ integer chop of float $u$

(INT\_CHOP\_TEST $t \ u$)

$t \leftarrow$ integer chop of float $u$

(INT\_FLOOR $t \ u$)

$t \leftarrow$ floor of float $u$

(INT\_FLOOR\_TEST $t \ u$)

$t \leftarrow$ floor of float $u$

(INT\_NEAR $t \ u$)

$t \leftarrow$ integer nearest float $u$

(INT\_NEAR\_TEST $t \ u$)

$t \leftarrow$ integer nearest float $u$

(INT\_ROUND $t \ u$)

$t \leftarrow$ integer round of float $u$

(INT\_ROUND\_TEST $t \ u$)

$t \leftarrow$ integer round of float $u$

These operations convert floats into signed integers. The roundings are directed as in IEEE Standard 754. INT\_ROUND uses the rounding mode in the ssw.

A float\_invalid exception is raised when the result is not a representable signed integer. In these cases the result is reduced modulo $2^{64}$.

The \_TEST versions of these operations never generate carry or overflow/NaN.
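The directed conversions and the modulo $2^{64}$ reduction can be modelled as follows. This is a sketch for finite inputs only: the function names mirror the operations but are illustrative, the second element of each result stands in for the float\_invalid exception, and INT\_ROUND is omitted because it depends on the rounding mode in the ssw.

```python
import math


def _wrap_signed64(n: int) -> int:
    """Reduce n modulo 2**64 and reinterpret the bits as a signed integer."""
    n &= (1 << 64) - 1
    return n - (1 << 64) if n >= (1 << 63) else n


def _convert(rounder, u: float) -> tuple[int, bool]:
    exact = rounder(u)                # exact integer after directed rounding
    wrapped = _wrap_signed64(exact)
    return wrapped, wrapped != exact  # flag models float_invalid


def int_ceil(u: float) -> tuple[int, bool]:
    return _convert(math.ceil, u)


def int_chop(u: float) -> tuple[int, bool]:
    return _convert(math.trunc, u)


def int_floor(u: float) -> tuple[int, bool]:
    return _convert(math.floor, u)


def int_near(u: float) -> tuple[int, bool]:
    return _convert(round, u)         # Python's round() uses ties to even
```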

## RAISES

float\_invalid, float\_inexact

## SEE ALSO

FLOAT\_INT, FLOAT\_UNS

(INT\_ADD $t \ u \ v$)

$t \leftarrow u + v$, integer

(INT\_ADD $x \ y \ z$)

$x \leftarrow y + z$, integer

(INT\_ADD\_TEST $t \ u \ v$)

$t \leftarrow u + v$, integer

These operations perform two's-complement and unsigned integer addition.

The resulting condition code from the \_TEST version of this operation has its two's-complement definition.
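One adder serves both interpretations because two's-complement and unsigned addition produce the same 64 result bits. The sketch below illustrates this; it assumes 64-bit registers, the helper names are illustrative, and condition codes are not modelled.

```python
MASK64 = (1 << 64) - 1


def int_add(u: int, v: int) -> int:
    """Add two 64-bit patterns, discarding any carry out of bit 63."""
    return (u + v) & MASK64


def as_unsigned(n: int) -> int:
    """Bit pattern of a signed value."""
    return n & MASK64


def as_signed(bits: int) -> int:
    """Two's-complement interpretation of a 64-bit pattern."""
    return bits - (1 << 64) if bits & (1 << 63) else bits
```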

RAISES
(nothing)
SEE ALSO
INT\_ADD\_IMM


```
(INT_ADD_IMM t u value)
      t \leftarrow u + value, integer
            {where value \in [1...512], bvalue = value - 1}
(INT_ADD_IMM x y value)
      x \leftarrow y + value, integer
            {where value \in [1...32], bvalue = value - 1}
(INT_ADD_IMM_TEST t u value)
      t \leftarrow u + value, integer
            {where value \in [1...512], bvalue = value - 1}
(INT_ADD_IMM_TEST x y value)
      x \leftarrow y + value, integer
            {where value \in [1...32], bvalue = value - 1}
```

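These operations add a constant between 1 and 512 to u, storing the sum in t, or a constant between 1 and 32 to y, storing the sum in x. The instruction encodes the constant as bvalue = value - 1.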
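The bvalue = value - 1 bias lets an n-bit field cover the range 1 through $2^n$ instead of 0 through $2^n - 1$. The sketch below shows the mapping; the field widths of 9 and 5 bits are inferred from the stated ranges rather than taken from the encoding diagrams, and the function names are illustrative.

```python
def encode_bvalue(value: int, field_bits: int) -> int:
    """Encode an immediate in [1, 2**field_bits] as the biased field value - 1."""
    limit = 1 << field_bits
    if not 1 <= value <= limit:
        raise ValueError(f"value must be in [1, {limit}]")
    return value - 1


def decode_bvalue(bvalue: int) -> int:
    """Recover the immediate from the biased field."""
    return bvalue + 1
```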

The resulting condition code from the \_TEST version of this operation has its two's-complement definition.

RAISES

(nothing)

SEE ALSO

INT\_ADD, INT\_SUB\_IMM

(INT\_ADD\_MUL $t \ u \ v \ w$)

$t \leftarrow u + v * w$, integer

(INT\_ADD\_MUL\_TEST $t \ u \ v \ w$)

$t \leftarrow u + v * w$, integer

These operations perform two's-complement multiplication and addition. A multiply is accomplished by letting u be register 0.

The \_TEST versions of these operations never generate carry or overflow/NaN, despite the fact that the multiply or the add might overflow.

If v or w is outside $[-2^{53} \dots 2^{53} - 1]$, the float\_extension exception is raised and the result in t may be incorrect.

RAISES

float\_extension

SEE ALSO

UNS\_ADD\_MUL\_UPPER, INT\_ADD, INT\_SUB\_MUL, INT\_SUB\_MUL\_REV


These operations are the last step in integer division. The product of the integer v and SpecialFloat64 w is shifted right according to the exponent of w and rounded, producing an integer.

The \_TEST versions of these operations generate carry when the quotient is not exact, i.e. when the division by  $2^{-exp}$  yields a non-zero remainder.

If v is outside  $[-2^{53} \dots 2^{53} - 1]$ , the float\_extension exception is raised and the result in t may be incorrect.

Although register u is not used in the current hardware implementation, the software requires u to contain the denominator in order to properly handle float\_extension exceptions.

RAISES
float\_extension
SEE ALSO

UNS\_DIV, §12.6

```
(INT_FETCH_ADD_AC_DISP r s ac disp)
      temp \leftarrow (word at s + 'disp mod 2^{48});
      (word at s + 'disp mod 2^{48}) \leftarrow r + (word at s + 'disp mod 2^{48}), with 'ac;
      r \leftarrow temp
            {where 'disp \in [0...16383], 'sdisp = 'disp/8}
(INT_FETCH_ADD_AC_INDEX r s ac y)
      temp \leftarrow (word at s + 8 * y mod 2^{48});
      (word at s + 8 * y mod 2^{48}) \leftarrow r + (word at s + 8 * y mod 2^{48}), with 'ac;
      r \leftarrow temp
(INT_FETCH_ADD_DISP r s disp)
      temp \leftarrow (word at s + 'disp mod 2^{48});
      (word at s + 'disp mod 2^{48}) \leftarrow r + (word at s + 'disp mod 2^{48});
      r \leftarrow temp
            {where 'disp \in [0...524287], 'sdisp = 'disp/8}
(INT_FETCH_ADD_INDEX r s y)
      temp \leftarrow (word at s + 8 * y mod 2^{48});
      (word at s + 8 * y mod 2^{48}) \leftarrow r + (word at s + 8 * y mod 2^{48});
      r \leftarrow temp
```

These operations generally behave like a LOAD operation followed by a STORE operation with respect to access control. If ac is present, it is used; otherwise the access control field of s is used.
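The data movement of a fetch-and-add can be modelled with a dictionary standing in for memory. This is a sketch under stated assumptions: words are taken to be 64 bits wide, access control and exceptions are ignored, and the function names and the dictionary representation are illustrative.

```python
ADDR_MASK = (1 << 48) - 1
WORD_MASK = (1 << 64) - 1


def fetch_add_disp(mem: dict, r: int, s: int, disp: int) -> int:
    """Add r to the word at (s + disp) mod 2**48; return the word's old value."""
    addr = (s + disp) & ADDR_MASK
    temp = mem.get(addr, 0)            # unwritten memory is modelled as zero
    mem[addr] = (r + temp) & WORD_MASK
    return temp                        # the caller places this in r


def fetch_add_index(mem: dict, r: int, s: int, y: int) -> int:
    """Indexed form: the index y is scaled by the 8-byte word size."""
    return fetch_add_disp(mem, r, s, 8 * y)


mem = {0x100: 5}
old_first = fetch_add_disp(mem, 3, 0x100, 0)   # word becomes 8
old_second = fetch_add_index(mem, 1, 0xF8, 1)  # same word, now 9
```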

## RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked
COUNTS AS
CNT\_INT\_FETCH\_ADD
SEE ALSO
LOAD, INT\_ADD


```
(INT_IMM t value)
      t \leftarrow value
            {where value \in [-2^{14} \dots 2^{14} - 1]}
```

This operation loads a signed immediate constant into register t.

RAISES
(nothing)
SEE ALSO
BIT\_MASK

These operations load a signed byte from memory. The fe\_control is taken from the ac field if present, or forced to FE\_NORMAL. If ac is present, its forward, data trap0, and data trap1 disable bits are used; otherwise those of s are used.

#### RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_LOAD

SEE ALSO

STOREB, UNS\_LOADB

INT\_LOADB\_

```
(INT_LOADH r s)
      r \leftarrow sign extend(halfword at s), with FE_NORMAL
(INT_LOADH_AC_DISP r s ac disp)
      r \leftarrow sign extend(halfword at s + 'disp mod 2^{48}), with 'ac
            {where 'disp \in [0...16383], 'sdisp = 'disp/4}
(INT_LOADH_AC_INDEX r s ac y)
      r \leftarrow sign extend(halfword at s + 4 * y mod 2^{48}), with 'ac
(INT_LOADH_DISP r s disp)
      r \leftarrow sign extend(halfword at s + 'disp mod 2^{48}), with FE_NORMAL
            {where 'disp \in [0...524287], 'sdisp = 'disp/4}
(INT_LOADH_INDEX r s y)
      r \leftarrow sign extend(halfword at s + 4 * y mod 2^{48}), with FE_NORMAL
```

These operations load a signed halfword from memory. The fe\_control is taken from the ac field if present, or forced to FE\_NORMAL. If ac is present, its forward, data trap0, and data trap1 disable bits are used; otherwise those of s are used.

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_LOAD

SEE ALSO

STOREH, UNS\_LOADH

(INT\_LOADQ r s)

 $_{64} \cdots _{61} ^{r} _{56} ^{s} _{51} ^{2} _{47} \cdots _{0} M.$ 

r $\leftarrow$ sign extend(quarterword at s), with FE\_NORMAL

(INT\_LOADQ\_AC\_DISP r s ac disp)

64..., r 5 C ... ac sdisp 5 MC

r $\leftarrow$ sign extend(quarterword at s + 'disp mod $2^{48}$), with 'ac {where 'disp $\in [0 \dots 16383]$, 'sdisp = 'disp/2}

(INT\_LOADQ\_AC\_INDEX r s ac y)

 $r_{64} \cdots r_{61} s_{56} s_{51} r_{47} \cdots r_{21} a c_{16} y_{11} 14_{53} g_{0}$  MC

r $\leftarrow$ sign extend(quarterword at s + 2 \* y mod $2^{48}$), with 'ac

(INT\_LOADQ\_DISP r s disp)

 $r_{64} \cdots r_{61} s_{36} c_{51} c_{47} \cdots s_{21} sdisp_{31} d_{10}$  MC

r $\leftarrow$ sign extend(quarterword at s + 'disp mod $2^{48}$), with FE\_NORMAL {where 'disp $\in [0 \dots 524287]$, 'sdisp = 'disp/2}

(INT\_LOADQ\_INDEX r s y)

 $r_{64} \cdots r_{61} s_{65} s_{51} s_{47} \cdots s_{21} s_{16} y_{11} s_{60} s_{16} MC$ 

r $\leftarrow$ sign extend(quarterword at s + 2 \* y mod $2^{48}$), with FE\_NORMAL

These operations load a signed quarterword from memory. The fe\_control is taken from the ac field if present, or forced to FE\_NORMAL. If ac is present, its forward, data trap0, and data trap1 disable bits are used; otherwise those of s are used.
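The sign extension performed by the INT\_LOADH and INT\_LOADQ families can be pictured with a small Python sketch. It assumes a quarterword is 16 bits and a halfword is 32 bits (inferred here from the displacement scaling by 2 and 4 against an 8-byte word); the function name and the use of Python integers to stand in for registers are illustrative only.

```python
QUARTERWORD_BITS = 16  # assumption: one quarter of a 64-bit word
HALFWORD_BITS = 32     # assumption: one half of a 64-bit word


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1          # keep only the loaded field
    sign = 1 << (bits - 1)            # weight of the field's sign bit
    return (value ^ sign) - sign      # negative when the sign bit is set
```

For instance, a quarterword holding `0xFFFF` would be delivered to the register as the integer -1 under this model.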

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_LOAD

SEE ALSO

STOREQ, UNS\_LOADQ


These operations determine the floor of the base 2 logarithm of a floating-point number.

For denormalized y, x will take on values less than the minimum exponent so that scalb(x, -logb(x)) is always less than two and greater than or equal to one. When y is infinity or NaN, the maximum positive integer is returned. When y is zero, the minimum negative integer is returned.

INT\_LOGB\_TEST never generates overflow/NaN, and generates carry if y is infinity, NaN or zero.
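A minimal Python sketch of the behavior described above may help. It assumes 64-bit integer registers for the "maximum positive" and "minimum negative" results and assumes the sign of y is ignored, as in the IEEE logb function; the function name and the returned (x, carry) pair are illustrative conventions, not part of the architecture.

```python
import math

INT64_MAX = 2**63 - 1   # assumed "maximum positive integer"
INT64_MIN = -(2**63)    # assumed "minimum negative integer"


def int_logb_test(y: float) -> tuple[int, bool]:
    """Model of INT_LOGB_TEST: returns (x, carry)."""
    if math.isnan(y) or math.isinf(y):
        return INT64_MAX, True
    if y == 0.0:
        return INT64_MIN, True
    # frexp gives abs(y) = m * 2**e with 0.5 <= m < 1,
    # so floor(log2(abs(y))) is e - 1, including for denormalized y.
    _, e = math.frexp(abs(y))
    return e - 1, False
```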

RAISES

(nothing)

(INT\_MAX t u v)

$$t \leftarrow \max(u, v), \text{ integer}$$

(INT\_MAX\_TEST t u v)

$$t \leftarrow \max(u, v), \text{ integer}$$

These operations select the larger of the two integer operands.

INT\_MAX\_TEST never generates overflow/NaN, and generates carry if u is selected, meaning  $u \ge v$ .

RAISES

(nothing)

SEE ALSO

INT\_MIN, SELECT\_INT, FLOAT\_MAX


(INT\_MEM\_ADD\_AC\_DISP r s ac disp)

(word at s + 'disp mod $2^{48}$) $\leftarrow$ r + (word at s + 'disp mod $2^{48}$), with 'ac

{where 'disp $\in [0 \dots 16383]$, 'sdisp = 'disp/8}

(INT\_MEM\_ADD\_AC\_INDEX r s ac y)

(word at s + 8 \* y mod $2^{48}$) $\leftarrow$ r + (word at s + 8 \* y mod $2^{48}$), with 'ac

(INT\_MEM\_ADD\_DISP r s disp)

(word at s + 'disp mod $2^{48}$) $\leftarrow$ r + (word at s + 'disp mod $2^{48}$)

{where 'disp $\in [0 \dots 524287]$, 'sdisp = 'disp/8}

(INT\_MEM\_ADD\_INDEX r s y)

(word at s + 8 \* y mod $2^{48}$) $\leftarrow$ r + (word at s + 8 \* y mod $2^{48}$)

These operations generally behave like a LOAD operation followed by a STORE operation with respect to access control. Unlike INT\_FETCH\_ADD, the r register is not modified. If ac is present, it is used; otherwise the access control field of s is used.
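The read-modify-write effect described above can be sketched in Python as follows. The dictionary standing in for data memory, the function name, and the wraparound modulo $2^{64}$ are assumptions made for illustration; access control, trapping, and synchronization are not modeled.

```python
MASK64 = (1 << 64) - 1  # assumption: sums wrap to a 64-bit word


def int_mem_add(memory: dict[int, int], address: int, r: int) -> None:
    """Add r to the word at `address`; r itself is left unchanged."""
    memory[address] = (r + memory[address]) & MASK64
```

Because nothing is returned, the caller's register value is untouched, which is the contrast with INT\_FETCH\_ADD drawn in the text.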

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

SEE ALSO

LOAD, INT\_ADD

(INT\_MIN t u v)

$$t \leftarrow \min(u, v), \text{ integer}$$

(INT\_MIN\_TEST t u v)

$$t \leftarrow \min(u, v), \text{ integer}$$

These operations select the smaller of the two integer operands. INT\_MIN\_TEST never generates overflow/NaN, and generates carry if v is selected, meaning  $u \ge v$ .
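The carry rules for INT\_MAX\_TEST and INT\_MIN\_TEST both reduce to the comparison $u \ge v$. A minimal Python sketch, with function names and the (result, carry) return convention chosen only for illustration:

```python
def int_max_test(u: int, v: int) -> tuple[int, bool]:
    """Model of INT_MAX_TEST: carry is set when u is selected (u >= v)."""
    carry = u >= v
    return (u if carry else v), carry


def int_min_test(u: int, v: int) -> tuple[int, bool]:
    """Model of INT_MIN_TEST: carry is set when v is selected (u >= v)."""
    carry = u >= v
    return (v if carry else u), carry
```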

RAISES

(nothing)

SEE ALSO

INT\_MAX, SELECT\_INT, FLOAT\_MIN


(INT\_RECIP\_APPROX x y)

$$z_{64} \cdots z_{1116} y_{11} 08_{6} 00_{0}$$
 C

x $\leftarrow$ an approximation to $1/y$, floating point

This operation is used to compute an integer reciprocal. It performs a table lookup operation, followed by a linear interpolation using an adder-multiplier. The result is returned as a SpecialFloat64. If the y operand is zero, the float\_extension exception is raised.

RAISES

float\_extension

SEE ALSO

FLOAT\_RECIP\_APPROX, §12.6

(INT\_RECIP\_ERROR t v w)

$$t \leftarrow 1.0 - v * w, \text{ floating point, round to nearest}$$

This operation is used to perform integer division. The \_ERROR operations perform a partial Newton's method iteration using the adder-multiplier. Note that v is a Float64, while w is used as a SpecialFloat64 and t is returned as a Float64.
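To see why the error term $1.0 - v * w$ is useful, recall the textbook Newton iteration for a reciprocal, $w' = w + w(1 - vw)$. The Python sketch below uses ordinary doubles; how the hardware actually combines the error term with the approximation is not stated in this section, so `newton_refine` is an assumed completion of the "partial" step, and both function names are hypothetical.

```python
def recip_error(v: float, w: float) -> float:
    """The quantity described for the _ERROR operation: 1.0 - v * w."""
    return 1.0 - v * w


def newton_refine(v: float, w: float) -> float:
    """One full Newton step toward 1/v (assumed use of the error term)."""
    return w + w * recip_error(v, w)
```

Each step roughly squares the relative error, so a coarse table-lookup approximation converges in a few iterations.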

RAISES

(nothing)

SEE ALSO

INT\_DIV\_CHOP, INT\_DIV\_FLOOR, INT\_RECIP\_APPROX, §12.6

INT\_RECIP\_ERROR

(INT\_RECIP\_SHIFT x y)

$$x \leftarrow \log_2 \text{abs}(y), \text{ round to ceiling}$$

(INT\_RECIP\_SHIFT\_TEST x y)

$$x \leftarrow \log_2 \text{abs}(y), \text{ round to ceiling}$$

These operations are used to compute integer reciprocals. They compute the ceiling of the base 2 logarithm of the absolute value of y. When y is zero, x is set to -1.
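The ceiling of the base 2 logarithm can be computed exactly with integer operations. A minimal Python sketch (the function name is illustrative; condition codes are not modeled):

```python
def int_recip_shift(y: int) -> int:
    """Ceiling of log2(abs(y)); returns -1 when y is zero."""
    if y == 0:
        return -1
    # For n >= 1, (n - 1).bit_length() equals ceil(log2(n)).
    return (abs(y) - 1).bit_length()
```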

RAISES

(nothing)

SEE ALSO

INT\_DIV\_CHOP, §12.6

(INT\_RSQRT\_APPROX x y)

 $x_{64} \cdots x_{21} x_{16} y_{11} 09_{6} 00_{0}$  C

x $\leftarrow$ an approximation to $1/\sqrt{y}$, floating point

This operation is used to compute the reciprocal square root of a denormalized number. It performs a table lookup operation, followed by a linear interpolation using an adder-multiplier. The result is returned as a SpecialFloat64. If the absolute value of y is outside the range  $2^{64}$  to 1, it is effectively scaled into that range.

RAISES

(nothing)

SEE ALSO

FLOAT\_RSQRT\_APPROX, §12.5


(INT\_SHIFT\_RIGHT t v w)

$$t \leftarrow v \gg_a w$$

(INT\_SHIFT\_RIGHT x y z)

$$x \leftarrow y \gg_a z$$

(INT\_SHIFT\_RIGHT\_TEST t v w)

$$t \leftarrow v \gg_a w$$

(INT\_SHIFT\_RIGHT\_TEST x y z)

$$x \leftarrow y \gg_a z$$

These operations do an arithmetic shift right, filling bits on the left with copies of the sign bit. Unsigned shift counts in w/z are taken modulo 64.

The \_TEST version generates carry if a 1-bit is shifted out of v or y and never generates overflow/NaN.
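A minimal Python sketch of the arithmetic shift and its carry rule, assuming registers are modeled as unsigned 64-bit bit patterns; the helper and function names are illustrative only.

```python
MASK64 = (1 << 64) - 1


def to_signed(x: int) -> int:
    """Reinterpret a 64-bit pattern as a two's-complement integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def int_shift_right_test(v: int, w: int) -> tuple[int, bool]:
    """Model of INT_SHIFT_RIGHT_TEST: returns (result pattern, carry)."""
    count = w % 64                              # shift count taken modulo 64
    result = (to_signed(v) >> count) & MASK64   # Python's >> on a negative int copies the sign
    carry = (v & ((1 << count) - 1)) != 0       # any 1-bit shifted out of the low end
    return result, carry
```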

RAISES

(nothing)

SEE ALSO

UNS\_SHIFT\_RIGHT, SHIFT\_PAIR\_RIGHT, SHIFT\_LEFT, INT\_DIV\_FLOOR

(INT\_SUB t u v)

$$t \leftarrow u - v, \text{ integer}$$

(INT\_SUB x y z)

$$x \leftarrow y - z, \text{ integer}$$

(INT\_SUB\_TEST t u v)

$$t \leftarrow u - v, \text{ integer}$$

(INT\_SUB\_TEST x y z)

$$x \leftarrow y - z, \text{ integer}$$

These operations do two's-complement integer and unsigned subtraction.

The resulting condition code from the \_TEST version of this operation has its two's-complement definition.

RAISES

(nothing)

SEE ALSO

INT\_SUB\_IMM, UNS\_SUB\_CARRY\_TEST


```
(INT_SUB_IMM t u value)
                                                                                                                                                                                                                                                                                                               t_{47,42} = t_{37,36} = t_{27,31} = t_{2
                                 t \leftarrow u - 'value, integer
                                                 {where 'value \in [1...512], 'bvalue = 'value - 1}
 (INT_SUB_IMM x y value)
                                                                                                                                                                                                                                                                                                                                                  x_1 y_1 bvalue_6 06_0
                                x \leftarrow y - 'value, integer
                                               {where 'value \in [1 \dots 32], 'bvalue = 'value - 1}
 (INT_SUB_IMM_TEST t u value)
                                                                                                                                                                                                                                                                                                             t_{47} u_{47} t_{42} t_{37} t_{36} bvalue_{27} t_{21} \dots
                               t \leftarrow u - \text{'value, integer}
                                               {where 'value \in [1...512], 'bvalue = 'value - 1}
(INT_SUB_IMM_TEST x y value)
                                                                                                                                                                                                                                                                                                                                                x_{64} \dots x_{21} y_{11} bvalue_{6} 07_{0}
                               x \leftarrow y - 'value, integer
                                              {where 'value \in [1...32], 'bvalue = 'value - 1}
```

These operations effectively subtract a constant between 1 and 32 (512) from y (u), storing the result in x (t). The resulting condition code from the \_TEST version of this operation has its two's-complement definition.
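The biased immediate ('bvalue = 'value - 1) lets an n-bit field cover the range 1 to $2^n$ rather than 0 to $2^n - 1$. The Python sketch below assumes field widths of 5 and 9 bits, inferred from the stated ranges of 32 and 512; the function names are hypothetical.

```python
def encode_bvalue(value: int, field_bits: int) -> int:
    """Encode an immediate in [1, 2**field_bits] as value - 1."""
    if not 1 <= value <= (1 << field_bits):
        raise ValueError("immediate out of range for this field width")
    return value - 1


def decode_bvalue(bvalue: int) -> int:
    """Recover the immediate from the encoded field."""
    return bvalue + 1
```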

RAISES

(nothing)

SEE ALSO

INT\_SUB\_IMM, INT\_ADD\_IMM

(INT\_SUB\_MUL t u v w)

$$t \leftarrow u - v * w, \text{ integer}$$

(INT\_SUB\_MUL\_TEST t u v w)

$$t \leftarrow u - v * w, \text{ integer}$$

(INT\_SUB\_MUL\_REV t u v w)

$$t \leftarrow -u + v * w, \text{ integer}$$

(INT\_SUB\_MUL\_REV\_TEST t u v w)

$$t \leftarrow -u + v * w, \text{ integer}$$

These operations do two's-complement multiplication followed by subtraction.

The \_TEST versions of these operations never generate carry or overflow/NaN, despite the fact that the multiply or the add might overflow.

If v or w is outside  $[-2^{53} \dots 2^{53} - 1]$ , the float\_extension exception is raised.

RAISES

float\_extension

SEE ALSO

INT\_SUB, INT\_ADD\_MUL


```
(JUMP mask cn tn)
                                                            \dots mask cn F 1 6 tn
      if CV_{cn} \in mask then
          SSW.pc \leftarrow TN;
      end
(JUMP_OFTEN mask cn tn)
                                                            \dots mask cn F 0.7 tn
      if CV_{cn} \in mask then
         SSW.pc \leftarrow TN;
      else
         stop lookahead;
      end
(JUMP_SELDOM mask cn tn)
                                                           \dots mask cn F 1 7 tn
      if CV_{cn} \in mask then
         SSW.pc \leftarrow TN;
         stop lookahead;
      end
```

These operations do conditional branches if the selected condition code is a member of the condition mask. If taken, the branch goes to the location previously loaded into target register TN by a TARGET operation. The value mask is an eight-bit condition mask of type CondMask, described in §4. If mask is empty, then the test is false.

The trap mask SSW.tm, mode bits SSW.md, and condition vector CV are unaffected.

A JUMP\_OFTEN operation is intended for branches that will be taken frequently, such as the backwards branch in a loop or a branch over an error condition. Lookahead is stopped if the JUMP\_OFTEN fails. Lookahead is stopped if the JUMP\_SELDOM succeeds.
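The branch decision and its effect on lookahead can be sketched in Python. This sketch assumes a condition code is a value in 0..7 and that bit i of the eight-bit mask selects code i; the actual bit assignment is defined in §4 and may differ. The function names and the (taken, stop lookahead) return pair are illustrative.

```python
def condition_in_mask(cc: int, mask: int) -> bool:
    """Membership test; assumes bit `cc` of `mask` selects condition code `cc`."""
    return (mask >> cc) & 1 == 1


def jump(kind: str, cc: int, mask: int) -> tuple[bool, bool]:
    """Return (branch taken, lookahead stopped) for JUMP, JUMP_OFTEN, JUMP_SELDOM."""
    taken = condition_in_mask(cc, mask)
    if kind == "JUMP_OFTEN":
        stop_lookahead = not taken   # stopped when the expected branch fails
    elif kind == "JUMP_SELDOM":
        stop_lookahead = taken       # stopped when the unexpected branch succeeds
    else:
        stop_lookahead = False       # plain JUMP does not stop lookahead
    return taken, stop_lookahead
```

An empty mask (`mask == 0`) makes the test false for every condition code, matching the rule stated above.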

JUMP\_OFTEN and JUMP\_SELDOM can be monitored via CNT\_JUMP\_EXPECTED and CNT\_JUMP\_UNEXPECTED to observe branch prediction accuracy. JUMP only counts toward CNT\_TRANSFER\_TOTAL.

RAISES

(nothing)

COUNTS AS

CNT\_JUMP\_EXPECTED if expected path taken, CNT\_JUMP\_UNEXPECTED if unexpected path taken, CNT\_TRANSFER\_TOTAL

SEE ALSO

SKIP


```
(LEVEL_ENTER lev)
                                                                        _{64}\dots_{21}_{16}_{16}_{13}_{11}_{7}_{6}_{6}_{3}_{0}
                                                                                                         C
       if (LEVEL = 'lev) then
            LEVEL \leftarrow program map execute protection level;
            SSW.ssw_override \leftarrow true;
            suppress program protection exception
       else
            raise program protection exception
       end
(LEVEL_RTN lev tn)
                                                                      _{64} \cdots _{21} 00_{16} 01_{13} lev_{11} F_{7} 0_{6} 6_{3} tn_{0}
                                                                                                        C
      if LEVEL \geq 'lev then
           SSW.pc \leftarrow tn;
           SSW.ssw_override \leftarrow false;
           LEVEL \leftarrow 'lev
      else
           raise privileged operation exception
      end
```

These operations change the privilege level of a stream (see §8.1). The LEVEL\_ENTER operation is normally placed at privileged entry points, with a matching LEVEL\_RTN at the exit. If a LEVEL\_ENTER is executed from the wrong privilege level a program protection exception will be raised, whether the privilege level of the stream matched the program map execute protection level or not. If a LEVEL\_RTN attempts to raise the privilege level, a privileged operation exception will be raised.

Lookahead is disabled when LEVEL\_ENTER sets ssw\_override. However, lookahead beyond a LEVEL\_ENTER still may result in lost exception detail. Lookahead is not disabled for LEVEL\_RTN.

RAISES

privileged, prog\_prot

COUNTS AS

CNT\_LEVEL if LEVEL\_ENTER

SEE ALSO

DOMAIN\_LEAVE

(LOAD r s)

$r \leftarrow (\text{word at } s)$, with FE\_NORMAL

(LOAD\_AC\_DISP r s ac disp)

$r \leftarrow (\text{word at } s + \text{'disp} \mod 2^{48})$, with 'ac

{where 'disp $\in [0 \dots 16383]$, 'sdisp = 'disp/8}

(LOAD\_AC\_INDEX r s ac y)

$r \leftarrow (\text{word at } s + 8 * y \mod 2^{48})$, with 'ac

(LOAD\_DISP r s disp)

$r \leftarrow (\text{word at } s + \text{'disp} \mod 2^{48})$, with FE\_NORMAL

{where 'disp $\in [0 \dots 524287]$, 'sdisp = 'disp/8}

(LOAD\_INDEX r s y)

$r \leftarrow (\text{word at } s + 8 * y \mod 2^{48})$, with FE\_NORMAL

These operations load a word from memory. The fe\_control is taken from the ac field if present, or forced to FE\_NORMAL. If ac is present, its forward, data trap0, and data trap1 disable bits are used; otherwise those of s are used. They are used to load floating-point numbers and 64-bit signed and unsigned integers.

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_LOAD

SEE ALSO

STORE


(LOAD\_FE r s)

$$r \leftarrow (\text{word at } s)$$

$$M$$

This operation loads a word from memory, obeying the fe\_control in the pointer. It is used to load floating-point numbers, and 64-bit signed and unsigned integers. This operation allows synchronizing loads to be performed without using explicit access control in the operation.

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_LOAD

SEE ALSO

STORE, PTR\_SET\_AC


(LOGICAL\_ALLONE t mask cn)

$$t \leftarrow \text{if } CV_{cn} \in \text{'mask then } -1 \text{ else } 0 \text{ end}$$

(LOGICAL\_ALLONE\_TEST t mask cn)

$$t \leftarrow \text{if } CV_{cn} \in \text{'mask then } -1 \text{ else } 0 \text{ end}$$

These operations convert a condition code into a logical value in all the bits in t. In contrast, LOGICAL\_ONE produces a single-bit logical value.

The fields mask and cn together refer to  $CV_{cn}$ , a condition code that was previously generated. The value mask is an eight-bit wide condition mask of type CondMask, described in §4.

The resulting condition code from the \_TEST version of this operation has its two's-complement definition.

RAISES

(nothing)

SEE ALSO

LOGICAL\_ONE, SELECT\_INT

(LOGICAL\_ONE t mask cn)

$$t \leftarrow \text{if } CV_{cn} \in \text{'mask then } 1 \text{ else } 0 \text{ end}$$

(LOGICAL\_ONE\_TEST t mask cn)

$$t \leftarrow \text{if } CV_{cn} \in \text{'mask then } 1 \text{ else } 0 \text{ end}$$

$$A$$

These operations convert a condition code into a single-bit logical value in bit 0 of t. In contrast, LOGICAL\_ALLONE produces a full-word logical value.
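The difference between the single-bit and full-word results can be shown with a short Python sketch. As an assumption for illustration, a condition code is taken to be a value in 0..7 and bit i of the mask is taken to select code i (the real mapping is given in §4); the value -1 is represented as the all-ones 64-bit pattern.

```python
MASK64 = (1 << 64) - 1


def in_mask(cc: int, mask: int) -> bool:
    """Assumed membership test: bit `cc` of the eight-bit mask."""
    return (mask >> cc) & 1 == 1


def logical_one(cc: int, mask: int) -> int:
    """1 in bit 0 of the result when the condition is in the mask, else 0."""
    return 1 if in_mask(cc, mask) else 0


def logical_allone(cc: int, mask: int) -> int:
    """All 64 bits set (-1) when the condition is in the mask, else 0."""
    return MASK64 if in_mask(cc, mask) else 0
```

The all-ones form is convenient as an AND mask, while the single-bit form can be added or counted directly.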

The fields mask and cn together refer to  $CV_{cn}$ , a condition code that was previously generated. The value mask is an eight-bit wide condition mask of type CondMask, described in §4.

The resulting condition code from the \_TEST version of this operation has its two's-complement definition.

RAISES

(nothing)

SEE ALSO

LOGICAL\_ALLONE, SELECT\_INT

(NOP) .....00 00 19 00 C

These operations do nothing.

The M-unit NOP is encoded as a UNS\_LOADQ into register r0 from register r0. The A-unit NOP is encoded as an INT\_IMM into register r0. The C-unit NOP is encoded as a CLOCK into register r0.

RAISES

(nothing)

COUNTS AS

M-unit NOP as CNT\_M\_NOP; A-unit NOP as CNT\_A\_NOP; C-unit NOP as CNT\_C\_NOP

```
(PROBE_DISP r s lev access disp)
                                                    64 ... 1 56 51 F ... lev access 0 sdisp 11
                                                                                                     MC
       maplevel \leftarrow level required for 'access at s + 'disp;
       if (LEVEL < maplevel) and ('lev > LEVEL) then
             raise data protection level exception
       else if (min('lev, LEVEL) \ge maplevel) and (s \text{ has proper access control}) then
            r \leftarrow pointer to the last byte in the segment
       else
            r \leftarrow 0
       end
          {where 'disp \in [0...16383], 'sdisp = 'disp/8}
(PROBE_INDEX r s lev access y)
                                                   r_{64} \cdots r_{61} s_{56} s_{51} s_{47} \cdots s_{21} lev_{19} access 0 y_{16} 00_{16} 39_{10}
                                                                                                    MC
      maplevel \leftarrow level required for 'access at s + 8 * y;
      if (LEVEL < maplevel) and ('lev > LEVEL) then
            raise data protection level exception
      else if (min('lev, LEVEL) \ge maplevel) and (s \text{ has proper access control}) then
           r \leftarrow pointer to the last byte in the segment
      else
           r \leftarrow 0
      end
```

These operations are intended for checking the validity of address parameters passed from routines at lower protection levels to routines at higher protection levels. The protection level to check privilege is given by 'lev. It is specified by a member of the Level enumeration, such as LEV\_USER (see §8.1). If the pointer s lies beyond the map limit for this domain, the map level is set to LEV\_IPL. The kind of access check given by 'access is one of the following codes:

| Name         | Value | Meaning                                         |
|--------------|-------|-------------------------------------------------|
| ProbeControl |       |                                                 |
| P_READ       | 0     | check if the address is mapped for reading      |
| P_MODIFY     | 1     | check if the address is mapped for modification |

Besides checking whether the addressed location can be read or written at the given privilege level, these operations make sure that forwarding, trapping, and memory full bit testing are all disabled in the pointer s. The pointer returned in r has the same access control field as s.
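The decision logic in the PROBE pseudo-code can be transcribed into Python as a sketch. Levels are modeled as integers where a larger number means more privilege, which is an assumption consistent with the comparisons above; the exception class, parameter names, and the boolean standing in for "s has proper access control" are hypothetical.

```python
class DataProtectionLevelError(Exception):
    """Stands in for the data protection level exception."""


def probe(level: int, lev: int, maplevel: int,
          access_control_ok: bool, last_byte_pointer: int) -> int:
    """Return the value placed in r, or raise on a protection level violation."""
    if level < maplevel and lev > level:
        raise DataProtectionLevelError
    if min(lev, level) >= maplevel and access_control_ok:
        return last_byte_pointer
    return 0
```

A result of 0 tells the caller that the address parameter should not be trusted at the probed level.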

RAISES

data\_prot

SEE ALSO

DATA\_MAP\_


(PROGRAM\_CACHE\_FLUSH u)

flush (program instruction at u) from program instruction cache

(PROGRAM\_CACHE\_FLUSH\_ANY u)

flush any (program instruction at u) from program instruction cache

(PROGRAM\_CACHE\_FLUSH\_L1 u)

$$A = \begin{bmatrix} 0 & 0 & u & 00 & 05 & 0A & \dots & A \end{bmatrix}$$

flush any (program instruction at u) from L1 program instruction cache

These are supervisor-privileged operations to maintain consistency in the program instruction caches.

The address in u is a physical word offset into local memory. The PROGRAM\_CACHE\_FLUSH operation is used to flush a single page of instructions, as after storing new data in a page frame. The PROGRAM\_CACHE\_FLUSH\_ANY operation is used to flush any entries from the caches, presumably only during system initialization. Since the cache is not fully associative, multiple flushes may be required—each flush should specify a different cache line. See §7.2. The PROGRAM\_CACHE\_FLUSH\_L1 operation flushes the L1 cache only. This restriction allows multiple pages to be flushed from L1 more quickly without disturbing the L2 cache.

RAISES

privileged

SEE ALSO

PROGRAM\_MAP\_

(PROGRAM\_MAP\_FLUSH u)

 $0.010^{-0.0} \times 0.010^{-0.0} \times 0.01$ 

flush (program map at u) from program map cache

(PROGRAM\_MAP\_FLUSH\_ANY u)

 $a_{41} \cdots a_{70} a_{44} a_{22} a_{37} a_{32} a_{27} a_{21} a_{21} a_{32}$  A

flush any (program map at u) from program map cache

These are supervisor-privileged operations to maintain consistency in the program address translation cache.

Bits 63-60 and bits 31-12 of u address the program map cache entry; other bits of u are ignored. A violation of the map limit will not raise a map limit exception. The PROGRAM\_MAP\_FLUSH operation is used to flush a single map entry, as after changing the program map in I/O memory. The PROGRAM\_MAP\_FLUSH\_ANY operation is used to flush any map entry for the given domain from the cache. Since the cache is not fully associative, up to 128 flushes may be required; each flush should specify a different page modulo 128. See §7.1.

RAISES

privileged

SEE ALSO

DATA\_MAP\_


(PROGRAM\_STATE\_RESTORE u)

$$A = \begin{bmatrix} 0 & 0 & u & 00 & 00 & 0A & ... & A \\ 47 & 44 & 42 & 37 & 32 & 27 & 21 & 0 \end{bmatrix}$$

(program state descriptor for domain in bits 63-60 of u) $\leftarrow$ u

This supervisor-privileged operation is used to set the program state descriptor.

The register u contains a program state descriptor; see §8.5. Bits 63-60 of u specify the protection domain, i.e. the program state descriptor is self-tagged.

RAISES

privileged

SEE ALSO

DATA\_STATE

These operations modify the access control field of pointers. That is, if the resulting pointer (t or x) is used without an accompanying access control field in a subsequent memory reference operation, then the effect will be as if u or y and ac had been used instead.

RAISES

(nothing)

SEE ALSO

LOAD\_FE, STORE

PTR\_SET\_

 $t_{64} \dots t_{1742} = t_{2722} = t_{2722} = t_{21} = t_{10}$  A

t $\leftarrow$ rounded float v

This operation is used to do rounding conversions from 64-bit floating point to 32-bit floating point. The inverse of this operation is FLOAT\_REAL.

If v is NaN, t is that NaN after rounding the least significant bits from the significand and ORing in a one at bit 3 to preserve NaN identity.

RAISES

float\_overflow, float\_underflow, float\_inexact

SEE ALSO

FLOAT\_REAL

```
(REG_LOAD_AC_DISP r s ac disp)
                                                                                r_{64} \cdots r_{56} s_{51} s_{14} \cdots s_{21} ac_{16} sdisp_{01}
                                                                                                                               MC
         r \leftarrow (\text{word at } s + \text{'disp}), \text{ with 'ac};
         (poison at 'r) \leftarrow (full at s + 'disp)
             {where 'disp \in [0...16383], 'sdisp = 'disp/8}
(REG_LOAD_AC_INDEX r s ac y)
                                                                               r_{61} \cdot \cdot \cdot r_{61} \cdot s_{65} \cdot s_{147} \cdot \cdot \cdot \cdot r_{21} \cdot ac_{15} \cdot y_{11} \cdot 09_{69} \cdot 39_{69}
                                                                                                                               MC
         r \leftarrow (\text{word at } s + 8 * y), \text{ with 'ac};
         (poison at 'r) \leftarrow (full at s + 8 * y)
(REG_LOAD_DISP r s disp)
                                                                                    r s D \dots sdisp OO
                                                                                                                               MC
        r \leftarrow (\text{word at } s + 'disp);
        (poison at 'r) \leftarrow (full at s + 'disp)
            {where 'disp \in [0...524287], 'sdisp = 'disp/8}
(REG_LOAD_INDEX r s y)
                                                                              r_{64} \cdots r_{61} r_{56} r_{51} r_{47} \cdots r_{21} r_{16} r_{11} r_{11} r_{60}
                                                                                                                              MC
        r \leftarrow (\text{word at } s + 8 * y);
        (poison at 'r) \leftarrow (full at s + 8 * y)
```

These operations load the word and the memory full bit of the access state (§6.1) from the addressed memory cell. The word is stored in register r and the memory full bit is stored in the poison bit for register r.

These operations are only subject to the trapping and forwarding normally controlled by the access state of the addressed memory location. If ac is present, its forward, data trap0 and data trap1 disable bits are used; otherwise those of s are used.

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_LOAD

SEE ALSO

REG\_STORE, REG\_MOVE


(REG\_MOVE t v)

$$t \leftarrow v;$$

(poison at t) $\leftarrow$ (poison at v)

These operations copy data from one register to another, without raising a poison exception.

RAISES

(nothing)

```
(REG_STORE_AC_DISP r s ac disp)
                                                                          _{64} \cdots _{61} ^{7} _{56} ^{5} _{51} ^{F} _{47} \cdots _{21} ac_{16} sdisp.01
                                                                                                                       MC
         (word at s + 'disp) \leftarrow r, with 'ac;
         (full at s + 'disp) \leftarrow (poison at 'r)
             {where 'disp \in [0...16383], 'sdisp = 'disp/8}
 (REG_STORE_AC_INDEX r s ac y)
                                                                          r_{61} \cdots r_{61} s_{65} s_{51} s_{47} \cdots r_{21} a c_{16} y_{10} 01_{63}
                                                                                                                      MC
         (word at s + 8 * y) \leftarrow r, with 'ac;
         (full at s + 8 * y) \leftarrow (poison at 'r)
(REG_STORE_DISP r s disp)
                                                                              _{64} \cdots _{61}^{r} _{56}^{s} _{51}^{F} _{47} \cdots _{21}^{s} sdisp_{500}
                                                                                                                      MC
        (word at s + 'disp) \leftarrow r;
        (full at s + 'disp) \leftarrow (poison at 'r)
            {where 'disp \in [0...524287], 'sdisp = 'disp/8}
(REG_STORE_INDEX r s y)
                                                                       r_{64} \cdots r_{61} s_{65} s_{51} s_{47} \cdots s_{21} 00 y_{16} s_{11} 01_{6} 38_{0}
                                                                                                                      MC
        (word at s + 8 * y) \leftarrow r;
        (full at s + 8 * y) \leftarrow (poison at 'r)
```

These operations store both the memory full bit of the access state (§6.1) and the value in the addressed memory cell. The value comes from register r. The memory full bit comes from the poison bit associated with register r, complementing the action of a REG\_LOAD operation. Thus, these operations are not subject to a poison exception due to r.
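The complementary pairing of REG\_STORE and REG\_LOAD can be pictured as a round trip that preserves both the value and the poison/full bit. The Python sketch below is a simplified model; the `Register` and `Cell` classes, the dictionary used as memory, and the function names are all assumptions for illustration, and trapping and forwarding are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Register:
    value: int = 0
    poison: bool = False


@dataclass
class Cell:
    value: int = 0
    full: bool = False


def reg_store(memory: dict[int, Cell], address: int, r: Register) -> None:
    """Store r's value; the cell's full bit takes r's poison bit."""
    memory[address] = Cell(r.value, r.poison)


def reg_load(memory: dict[int, Cell], address: int) -> Register:
    """Load the cell's value; the register's poison bit takes the full bit."""
    cell = memory[address]
    return Register(cell.value, cell.full)
```

Storing a register and loading it back from the same address returns an equal `Register`, which is what makes the pair suitable for saving and restoring register state.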

These operations are only subject to the trapping or forwarding normally controlled by the access state of the addressed memory location. If ac is present, its forward, data trap0 and data trap1 disable bits are used; otherwise those of s are used.

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_STORE

SEE ALSO

REG\_LOAD, REG\_MOVE


(RESULTCODE\_SAVE x)

 $x_{64} \dots x_{16} = 0 1000$  C

x $\leftarrow$ RESULTCODE rotated by lookahead index;

RESULTCODE $\leftarrow$ 0

This operation saves and clears the result code register, described in §9.1. The value of RESULTCODE is undefined after an instruction combining RESULTCODE\_SAVE with a floating-point A-operation. The data resultcodes are rotated so that dr0 corresponds to opa0.

RAISES

(nothing)

SEE ALSO

EXCEPTION\_, DATA\_OPA\_SAVE

Rotate Operations


(ROTATE\_LEFT x y z)

$$x \leftarrow y \text{ rotated left by } z$$

(ROTATE\_LEFT\_TEST x y z)

$$x \leftarrow y \text{ rotated left by } z$$

(ROTATE\_RIGHT x y z)

$$x \leftarrow y \text{ rotated right by } z$$

(ROTATE\_RIGHT\_TEST x y z)

$$x \leftarrow y \text{ rotated right by } z$$

These operations rotate a word to the left or right. The unsigned rotation amount z is taken modulo 64.

The \_TEST version generates carry if a 1-bit is rotated out of one end of y and into the other end, and never generates overflow/NaN.
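A minimal Python sketch of the rotations and their carry rule, with registers modeled as unsigned 64-bit patterns. The function names and the (result, carry) return convention are illustrative only.

```python
MASK64 = (1 << 64) - 1


def rotate_left_test(y: int, z: int) -> tuple[int, bool]:
    """Rotate y left by z mod 64; carry if a 1-bit wraps from the top to the bottom."""
    n = z % 64
    y &= MASK64
    wrapped = y >> (64 - n)                  # top n bits; zero when n == 0
    result = ((y << n) | wrapped) & MASK64
    return result, wrapped != 0


def rotate_right_test(y: int, z: int) -> tuple[int, bool]:
    """Rotate y right by z mod 64; carry if a 1-bit wraps from the bottom to the top."""
    n = z % 64
    y &= MASK64
    wrapped = y & ((1 << n) - 1)             # low n bits; zero when n == 0
    result = ((y >> n) | (wrapped << (64 - n))) & MASK64
    return result, wrapped != 0
```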

RAISES

(nothing)

SEE ALSO

SHIFT\_LEFT, INT\_SHIFT\_RIGHT, UNS\_SHIFT\_RIGHT, SHIFT\_PAIR


These operations are used to conditionally select the value u or v.

The masks intselect and floatselect describe the values of the condition in $CV_{cn}$ that will select u rather than v; see §4.

The condition test intselect may be one of SEL\_CY, SEL\_EQ, SEL\_IGT, SEL\_IGE, SEL\_UGT, SEL\_UGE, SEL\_IPZ. Reversal of u and v effectively yields the additional tests SEL\_NC, SEL\_NE, SEL\_ILE, SEL\_ILT, SEL\_ULE, SEL\_ULT, SEL\_IMZ, and SEL\_IMI. The condition test floatselect may be one of SEL\_FLT, SEL\_FGE, SEL\_FGT, SEL\_FLE, or SEL\_FUN. Reversing u and v yields additional nameless conditions. Selection based on floating-point equality may use a SELECT\_INT operation with SEL\_EQ.

These operations are "lazy" in that Poison in the selected value is merely propagated to the destination. Hence, no exceptions are raised if either value is poisoned.

The \_TEST versions of these operations never generate overflow/NaN or carry.

RAISES

(nothing)

SEE ALSO

INT\_MAX, INT\_MIN, FLOAT\_MAX, FLOAT\_MIN, BIT\_MERGE

(SHIFT\_LEFT x y z)

$$x \leftarrow y \ll z$$

(SHIFT\_LEFT\_TEST x y z)

$$x \leftarrow y \ll z$$

(SHIFT\_LEFT\_IMM t u sh)

$$t \leftarrow u \ll \text{'sh}$$

{where 'sh $\in [0 \dots 63]$}

(SHIFT\_LEFT\_IMM\_TEST t u sh)

$$t \leftarrow u \ll \text{'sh}$$

{where 'sh $\in [0 \dots 63]$}

These operations shift words to the left, filling vacated positions on the right with 0-bits. Unsigned shift counts in z are taken modulo 64.

The \_TEST version generates carry if a 1-bit is shifted out of u or y, and never generates overflow/NaN.

RAISES

(nothing)

SEE ALSO

UNS\_SHIFT\_RIGHT, INT\_SHIFT\_RIGHT, SHIFT\_PAIR, ROTATE\_LEFT


(SHIFT\_PAIR\_LEFT t u v w)

$$t \leftarrow (\text{the pair } (u, v) \ll w)/2^{64}$$

(SHIFT\_PAIR\_LEFT\_TEST t u v w)

$$t \leftarrow (\text{the pair } (u, v) \ll w)/2^{64}$$

(SHIFT\_PAIR\_RIGHT t u v w)

$$t \leftarrow (\text{the pair } (u, v) \gg w) \mod 2^{64}$$

(SHIFT\_PAIR\_RIGHT\_TEST t u v w)

$$t \leftarrow (\text{the pair } (u, v) \gg w) \mod 2^{64}$$

SHIFT\_PAIR\_LEFT shifts a copy of u left and fills vacated positions with bits from the left end of v, whereas SHIFT\_PAIR\_RIGHT shifts a copy of v right and fills vacated positions with bits from the right end of u. Unsigned shift counts in w are taken modulo 64.

The SHIFT\_PAIR\_LEFT\_TEST version generates carry if those bits of u not appearing in t are not all 0, and never generates overflow/NaN. The SHIFT\_PAIR\_RIGHT\_TEST version generates carry if those bits of v not appearing in t are not all 0, and never generates overflow/NaN.
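The pair shifts can be sketched in Python by treating (u, v) as one 128-bit quantity with u as the upper word. The sketch assumes the result t is truncated to 64 bits; the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def shift_pair_left_test(u: int, v: int, w: int) -> tuple:
    """Upper word of the pair (u, v) shifted left by w mod 64, plus carry."""
    count = w % 64
    pair = ((u & MASK64) << 64) | (v & MASK64)
    t = ((pair << count) >> 64) & MASK64
    # Carry: the bits of u that no longer appear in t are not all 0.
    carry = ((u & MASK64) >> (64 - count)) != 0
    return t, carry


def shift_pair_right_test(u: int, v: int, w: int) -> tuple:
    """Lower word of the pair (u, v) shifted right by w mod 64, plus carry."""
    count = w % 64
    pair = ((u & MASK64) << 64) | (v & MASK64)
    t = (pair >> count) & MASK64
    # Carry: the bits of v that no longer appear in t are not all 0.
    carry = (v & ((1 << count) - 1)) != 0
    return t, carry
```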

RAISES

(nothing)

SEE ALSO

SHIFT\_LEFT, ROTATE\_LEFT, UNS\_SHIFT\_RIGHT, INT\_SHIFT\_RIGHT

Skip Operations

```
(SKIP mask cn offset)                                ... mask cn hi 1 6 lo    C
    if cv_cn ∈ 'mask then
        ssw.pc ← ssw.pc + ('offset + 1)
    end
    {where 'offset ∈ [0...119], 'lo = 'offset mod 8, 'hi = ⌊'offset/8⌋}
(SKIP_OFTEN mask cn offset)                          ... mask cn hi ... lo    C
    if cv_cn ∈ 'mask then
        ssw.pc ← ssw.pc + ('offset + 1)
    else
        stop lookahead
    end
    {where 'offset ∈ [0...119], 'lo = 'offset mod 8, 'hi = ⌊'offset/8⌋}
(SKIP_SELDOM mask cn offset)                         ... mask cn hi 1 7 lo
    if cv_cn ∈ 'mask then
        ssw.pc ← ssw.pc + ('offset + 1)
        stop lookahead
    end
    {where 'offset ∈ [0...119], 'lo = 'offset mod 8, 'hi = ⌊'offset/8⌋}
```

These operations perform a conditional forward branch if the selected condition code is a member of the condition mask. If taken, the skip skips the following offset instructions. The value mask is an eight-bit-wide condition mask of type CondMask, described in §4. If mask is empty, then the skip always fails.

SKIP\_OFTEN and SKIP\_SELDOM can be monitored via CNT\_JUMP\_EXPECTED and CNT\_JUMP\_UNEXPECTED to observe branch prediction accuracy. SKIP only counts toward CNT\_TRANSFER\_TOTAL.
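The offset encoding and the effect of a skip on the program counter can be sketched as follows. The sketch assumes that bit i of the mask stands for condition value i and that ordinary sequencing advances the program counter by one instruction when the skip is not taken. The function names are hypothetical.

```python
def encode_offset(offset: int) -> tuple:
    """Split a skip offset into its (hi, lo) encoding fields."""
    if not 0 <= offset <= 119:
        raise ValueError("offset must lie in [0...119]")
    return offset // 8, offset % 8


def next_pc(pc: int, cond: int, mask: int, offset: int) -> int:
    """Program counter after a skip located at pc."""
    taken = bool((mask >> cond) & 1)   # an empty mask never matches
    if taken:
        return pc + offset + 1         # skips the following `offset` instructions
    return pc + 1                      # assumed: fall through to the next instruction
```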

```
RAISES
```

(nothing)

COUNTS AS

CNT\_JUMP\_EXPECTED if expected path taken, CNT\_JUMP\_UNEXPECTED if unexpected path taken, CNT\_TRANSFER\_TOTAL

SEE ALSO

**JUMP** 

SKIP\_

```
(SSW_DISP x offset)
    x ← SSW + 'offset + 1
    {where 'offset ∈ [0...119], 'lo = 'offset mod 8, 'hi = ⌊'offset/8⌋}
(SSW_RESTORE u)
    SSW.cv ← u.cv
    SSW.tm ← u.tm
    SSW.md ← u.md
```

The SSW\_DISP operation is used to load a branch address into a general purpose register x, rather than a target register. Thus, the value may be later loaded into a target with TARGET\_RESTORE prior to jumping to the location. It also returns the trap, mode, and condition fields of the SSW.

The SSW\_RESTORE operation is used to set the trap, mode, and condition fields in the ssw. The value of CV after an instruction combining SSW\_RESTORE with a \_TEST C-op is undefined.

RAISES

(nothing)

SEE ALSO

TARGET\_RESTORE

(STATE\_LOAD\_DISP  $r \ s \ disp$ )
$$r \leftarrow (access \ state \ at \ s + 'disp)$$

$$\{ where \ 'disp \in [0 \dots 524287], \ 'sdisp = 'disp/8 \}$$
(STATE\_LOAD\_INDEX  $r \ s \ y$ )
$$r \leftarrow (access \ state \ at \ s + 8 * y)$$

These operations load the access state (§6.1) from the addressed memory cell and convert it into an access control field (§6.1) stored as a pointer in register r. Word alignment is not required; the byte select bits are ignored.

These operations are not subject to the trapping, forwarding, or memory full bit waiting normally controlled by the access state of the addressed memory location.

The pointer r is constructed as follows (the access control field is constructed by inverting the operations done by a STATE\_STORE operation):

- A copy of the forward enable bit (field "forward\_enable" in the access state) is placed in field "fwd\_disable" of r.
- A copy of data trap bit 0 (field "trap0\_enable" in the access state) is placed in both field "trap0\_store\_disable" and field "trap0\_load\_disable" of r.
- A copy of data trap bit 1 (field "trap1\_enable" in the access state) is placed in both field "trap1\_store\_disable" and field "trap1\_load\_disable" of r.
- The memory full bit (field "full" in the access state) is used to construct the full/empty control (field "fe\_control"). That field is set to FE\_FUTURE if the memory full bit is true, and is set to FE\_SYNC if the memory full bit is false.
- Other bits of r are set to 0.
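The construction above can be sketched in Python. The field names are those quoted in the list; the two classes and the function name are hypothetical, and bit positions within the pointer are not modeled.

```python
from dataclasses import dataclass


@dataclass
class AccessState:
    forward_enable: bool
    trap0_enable: bool
    trap1_enable: bool
    full: bool


@dataclass
class AccessControl:
    fwd_disable: bool
    trap0_store_disable: bool
    trap0_load_disable: bool
    trap1_store_disable: bool
    trap1_load_disable: bool
    fe_control: str  # "FE_FUTURE" or "FE_SYNC" in this sketch


def state_load_convert(state: AccessState) -> AccessControl:
    """Build the access control field that a state load places in r."""
    return AccessControl(
        fwd_disable=state.forward_enable,
        trap0_store_disable=state.trap0_enable,
        trap0_load_disable=state.trap0_enable,
        trap1_store_disable=state.trap1_enable,
        trap1_load_disable=state.trap1_enable,
        fe_control="FE_FUTURE" if state.full else "FE_SYNC",
    )
```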

RAISES

data\_hw\_error, data\_prot

COUNTS AS

CNT\_LOAD

SEE ALSO

STATE\_STORE, STATE\_LOCK

STATE\_LOAD\_


```
(STATE_LOCK_AC_DISP r s ac disp)                                        MC
    r ← (access state at s + 'disp), with 'ac;
    (access state at s + 'disp) ← (forwarded, empty)
    {where 'disp ∈ [0...16383], 'sdisp = 'disp/8}
(STATE_LOCK_AC_INDEX r s ac y)                                          MC
    r ← (access state at s + 8 * y), with 'ac;
    (access state at s + 8 * y) ← (forwarded, empty)
(STATE_LOCK_DISP r s disp)                                              MC
    r ← (access state at s + 'disp);
    (access state at s + 'disp) ← (forwarded, empty)
    {where 'disp ∈ [0...524287], 'sdisp = 'disp/8}
(STATE_LOCK_INDEX r s y)                                                MC
    r ← (access state at s + 8 * y);
    (access state at s + 8 * y) ← (forwarded, empty)
```

These operations allow atomic access state manipulation, with additional access control modification through the operand ac. The operation loads the access state (§6.1) from the addressed memory cell, and converts the access state into an access control field (§6.1) stored as a pointer in register r, as is done by a STATE\_LOAD operation. The operation then sets the access state stored in the addressed memory cell to forwarded and empty. Word alignment is not required; the byte select bits are ignored.

These operations are not subject to the forwarding or memory full bit waiting normally controlled by the access state of the addressed memory location, except that a location that is both forwarded and not full is considered locked. In this case, the operation fails and is retried later. However, data trap bits are observed as in a normal memory operation. If ac is present, its data trap0 and data trap1 disable bits are used; otherwise those of s are used.

The STATE\_LOCK operation allows a possibly unforwarded memory word to be forwarded in an indivisible manner, locking the word with an "empty forwarding pointer" access state, while retrieving its current value.
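The locking rule can be sketched in Python. Only the forward and full bits are modeled; data traps and the hardware retry mechanism are left out, and a failed attempt is reported as `None`. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Cell:
    forwarded: bool
    full: bool


def state_lock(cell: Cell) -> Optional[Tuple[bool, bool]]:
    """Atomically read the (forwarded, full) state and lock the cell.

    Returns None when the cell is already locked (forwarded and not full);
    the hardware would retry the operation later in that case.
    """
    if cell.forwarded and not cell.full:
        return None
    old = (cell.forwarded, cell.full)
    cell.forwarded, cell.full = True, False   # forwarded and empty: locked
    return old
```

A second `state_lock` on the same cell fails until some other operation, for example a state store, rewrites the access state.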

```
RAISES

data_hw_error, data_prot

COUNTS AS

CNT_STORE

SEE ALSO

STATE_LOAD, STATE_STORE
```

```
(STATE_SCRUB_DISP r s disp)                                             MC
    dsyn ← (data syndrome at s + 'disp)
    asyn ← (access syndrome at s + 'disp)
    data ← (word at s + 'disp)
    for i ∈ [0...7]: r_i ← dsyn_i
    for i ∈ [8...11]: r_i ← asyn_{i-8}
    for i ∈ [12...63]: r_i ← data_i
    {where 'disp ∈ [0...524287], 'sdisp = 'disp/8}
(STATE_SCRUB_INDEX r s y)                                               MC
    dsyn ← (data syndrome at s + 8 * y)
    asyn ← (access syndrome at s + 8 * y)
    data ← (word at s + 8 * y)
    for i ∈ [0...7]: r_i ← dsyn_i
    for i ∈ [8...11]: r_i ← asyn_{i-8}
    for i ∈ [12...63]: r_i ← data_i
```

These operations atomically load and store the access state (§6.1) and load the data of the addressed memory cell to correct single-bit errors before they become uncorrectable multiple bit errors. Multiple-bit errors must be detected by examination of the syndrome bits. The combined syndrome bits for the data word and access state are returned. If there are no errors detected by the error-correction control logic, then these bits will be zero.

The appendix details the syndrome values returned.

These operations are not subject to the trapping, forwarding, or memory full bit waiting normally controlled by the access state of the addressed memory location.
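The layout of the result register can be sketched in Python, following the bit ranges in the definition: bits 0 to 7 hold the data syndrome, bits 8 to 11 the access syndrome, and bits 12 to 63 the corresponding bits of the data word. The function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def pack_scrub_result(dsyn: int, asyn: int, data: int) -> int:
    """Combine syndromes and data word into the 64-bit scrub result."""
    syndromes = (dsyn & 0xFF) | ((asyn & 0xF) << 8)
    return (data & MASK64 & ~0xFFF) | syndromes


def no_error_detected(result: int) -> bool:
    """True when all twelve combined syndrome bits are zero."""
    return (result & 0xFFF) == 0
```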

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

STATE\_SCRUB

```
(STATE_STORE_AC_DISP r s ac disp)                                       MC
    (word at s + 'disp) ← r;
    (access state at s + 'disp) ← accesscontrol(s), with 'ac
    {where 'disp ∈ [0...16383], 'sdisp = 'disp/8}
(STATE_STORE_AC_INDEX r s ac y)                                         MC
    (word at s + 8 * y) ← r;
    (access state at s + 8 * y) ← accesscontrol(s), with 'ac
(STATE_STORE_DISP r s disp)                                             MC
    (word at s + 'disp) ← r;
    (access state at s + 'disp) ← accesscontrol(s)
    {where 'disp ∈ [0...524287], 'sdisp = 'disp/8}
(STATE_STORE_INDEX r s y)                                               MC
    (word at s + 8 * y) ← r;
    (access state at s + 8 * y) ← accesscontrol(s)
```

These operations store both the access state (§6.1) and the value in the addressed memory cell. The value comes from register r. The access state comes from the access control field (§6.1) of register s, inverting the encoding done by a STATE\_LOAD operation.

These operations are not subject to the trapping, forwarding, or memory full bit waiting normally controlled by the access state of the addressed memory location.

The access state in the memory cell is constructed from the access control field of s as follows:

- A copy of the forward disable bit (field "fwd\_disable" in the access control) is placed in the forward enable bit, field "forward\_enable", in the access state.
- The logical OR of field "trap0\_store\_disable" and field "trap0\_load\_disable" is placed in the "data trap 0" enable bit, field "trap0\_enable", in the access state.
- The logical OR of field "trap1\_store\_disable" and field "trap1\_load\_disable" is placed in the "data trap 1" enable bit, field "trap1\_enable", in the access state.
- The memory full bit, field "full", is set if the full/empty control (field "fe\_control") is set to FE\_FUTURE; with FE\_SYNC, the memory full bit is cleared. The result is undefined if the field is set to FE\_NORMAL.

The exception is that if the ac (access control) operand is present, then the forwarding and data trap disable bits and full/empty control bits from ac replace those from s.
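The construction of the access state, including the ac override, can be sketched in Python. Access control fields are modeled as dictionaries keyed by the field names quoted above; the function name is hypothetical, and the sketch raises an error for FE\_NORMAL only because the result is undefined there.

```python
from typing import Optional


def state_store_convert(s_control: dict, ac: Optional[dict] = None) -> dict:
    """Build the access state written to memory by a state store."""
    # When ac is present, its bits replace those taken from s.
    control = ac if ac is not None else s_control
    fe = control["fe_control"]
    if fe == "FE_FUTURE":
        full = True
    elif fe == "FE_SYNC":
        full = False
    else:
        raise ValueError("result is undefined for FE_NORMAL")
    return {
        "forward_enable": control["fwd_disable"],
        "trap0_enable": control["trap0_store_disable"] or control["trap0_load_disable"],
        "trap1_enable": control["trap1_store_disable"] or control["trap1_load_disable"],
        "full": full,
    }
```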

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_STORE

SEE ALSO

STATE\_LOAD, STATE\_LOCK

```
(STATE_STORE_ERROR_DISP r s disp)
    (word at s + 'disp) ← r;
    (access state at s + 'disp) ← corrected access state
    {where 'disp ∈ [0...16383], 'sdisp = 'disp/8}

(STATE_STORE_ERROR_INDEX r s y)
    (word at s + 8 * y) ← r;
    (access state at s + 8 * y) ← corrected access state
```

These operations store the value and correct the access state (§6.1) in the addressed memory cell, as long as there is a correctable error in the old value stored. The value comes from register r.

These operations are not subject to the trapping, forwarding, or memory full bit waiting normally controlled by the access state of the addressed memory location.

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_STORE

SEE ALSO

STATE\_SCRUB

STATE\_STORE\_ERROR\_

These operations store a byte at the addressed location. If ac is present, it is used; otherwise the access control field of s is used.

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_STORE

SEE ALSO

INT\_LOADB, UNS\_LOADB

(STOREH  $r \ s$ )

(halfword at  $s$ ) $\leftarrow$  $r$ 

(STOREH\_AC\_DISP  $r \ s \ ac \ disp$ )

(halfword at  $s + 'disp \bmod 2^{48}$ ) $\leftarrow$  $r$ , with  $'ac$ 

(STOREH\_AC\_INDEX  $r \ s \ ac \ y$ )

(halfword at  $s + 4 * y \bmod 2^{48}$ ) $\leftarrow$  $r$ , with  $'ac$ 

(STOREH\_DISP  $r \ s \ disp$ )

(halfword at  $s + 'disp \bmod 2^{48}$ ) $\leftarrow$  $r$ 

{where  $'disp \in [0 \dots 524287]$ ,  $'sdisp = 'disp/4$ }

(STOREH\_INDEX  $r \ s \ y$ )

(halfword at  $s + 4 * y \bmod 2^{48}$ ) $\leftarrow$  $r$ 

These operations store a halfword at the addressed location. If ac is present, it is used; otherwise the access control field of s is used.

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_STORE

SEE ALSO

INT\_LOADH, UNS\_LOADH

STOREH\_

(STOREQ  $r \ s$ )

(quarterword at  $s$ ) $\leftarrow$  $r$ 

(STOREQ\_AC\_DISP  $r \ s \ ac \ disp$ )

(quarterword at  $s + 'disp \bmod 2^{48}$ ) $\leftarrow$  $r$ , with  $'ac$ 

{where  $'disp \in [0 \dots 16383]$ ,  $'sdisp = 'disp/2$ }

(STOREQ\_AC\_INDEX  $r \ s \ ac \ y$ )

(quarterword at  $s + 2 * y \bmod 2^{48}$ ) $\leftarrow$  $r$ , with  $'ac$ 

(STOREQ\_DISP  $r \ s \ disp$ )

(quarterword at  $s + 'disp \bmod 2^{48}$ ) $\leftarrow$  $r$ 

{where  $'disp \in [0 \dots 524287]$ ,  $'sdisp = 'disp/2$ }

(STOREQ\_INDEX  $r \ s \ y$ )

(quarterword at  $s + 2 * y \bmod 2^{48}$ ) $\leftarrow$  $r$ 

These operations store a quarterword at the addressed location. If ac is present, it is used; otherwise the access control field of s is used.

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_STORE

SEE ALSO

INT\_LOADQ, UNS\_LOADQ

(STORE  $r \ s$ )

(word at  $s$ ) $\leftarrow$  $r$ 

(STORE\_AC\_DISP  $r \ s \ ac \ disp$ )

(word at  $s + 'disp \bmod 2^{48}$ ) $\leftarrow$  $r$ , with  $'ac$ 

{where  $'disp \in [0 \dots 16383]$ ,  $'sdisp = 'disp/8$ }

(STORE\_AC\_INDEX  $r \ s \ ac \ y$ )

(word at  $s + 8 * y \bmod 2^{48}$ ) $\leftarrow$  $r$ , with  $'ac$ 

(STORE\_DISP  $r \ s \ disp$ )

(word at  $s + 'disp \bmod 2^{48}$ ) $\leftarrow$  $r$ 

{where  $'disp \in [0 \dots 524287]$ ,  $'sdisp = 'disp/8$ }

(STORE\_INDEX  $r \ s \ y$ )

(word at  $s + 8 * y \bmod 2^{48}$ ) $\leftarrow$  $r$ 

These operations store a word at the addressed location. If ac is present, it is used; otherwise the access control field of s is used.
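The address arithmetic shared by the word, halfword, and quarterword stores can be sketched in Python. The index scale factors (8, 4, 2) and the displacement ranges are taken from the definitions; the sketch assumes s is passed as a plain address with no access control bits, and the names are hypothetical.

```python
ADDRESS_MODULUS = 1 << 48
INDEX_SCALE = {"word": 8, "halfword": 4, "quarterword": 2}


def indexed_address(s: int, y: int, size: str) -> int:
    """Effective address of the _INDEX forms: s + scale * y, mod 2**48."""
    return (s + INDEX_SCALE[size] * y) % ADDRESS_MODULUS


def displaced_address(s: int, disp: int, has_ac: bool = False) -> int:
    """Effective address of the _DISP forms: s + disp, mod 2**48."""
    limit = 16383 if has_ac else 524287   # the _AC_DISP forms have a shorter field
    if not 0 <= disp <= limit:
        raise ValueError("displacement out of range")
    return (s + disp) % ADDRESS_MODULUS
```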

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_STORE

SEE ALSO

LOAD, STOREB

STORE\_

The STREAM\_CUR\_SAVE and STREAM\_RES\_SAVE operations respectively return the number of streams currently executing and the number of streams currently reserved in the stream's protection domain.

The STREAM\_IDENTIFIER\_SAVE operation is meant to help diagnose the stream management hardware. STREAM\_IDENTIFIER\_SAVE returns the issuing stream's stream number that was assigned by the hardware when the stream was created.

The STREAM\_LOOKAHEAD\_SAVE operation is used to read the three-bit lookahead lock counter index. It is used for mapping between the OPA and OPD registers of the M-unit and the data result codes. Since the data result codes are four-bit values, the index returned is scaled by four.

RAISES

(nothing)

SEE ALSO

STREAM\_RESERVE, STREAM\_CREATE, STREAM\_QUIT, §2

Stream Operations STREAM\_

```
(STREAM_CATCH r t x delay str)                                          MAC
    SSW.pc ← newpc;
    SSW.md ← newmd;
    SSW.tm ← newtm;
    ssw.cv ← 0;
    D ← newdomain;
    LEVEL ← newlevel;
    r ← data;
    t ← data;
    x ← data;
    T0 ← data;
    EXCEPTION is cleared;
    RESULTCODE is cleared;
    instruction counter ← 0;
```

This operation is internally generated by the hardware to complete the execution of a STREAM\_CREATE instruction. As such, IPL privilege is required to explicitly execute it.

```
RAISES
(nothing)
SEE ALSO
STREAM_CREATE
STREAM_CATCH_
```

```
(STREAM_COUNT_INST x)
    x ← (instruction counter)
(STREAM_COUNT_INST_RESTORE x y)
    x ← (instruction counter);
    (instruction counter) ← y
```

These operations manipulate the stream's instruction counter. STREAM\_COUNT\_INST returns the counter. STREAM\_COUNT\_INST\_RESTORE sets the instruction counter, which then counts down, stopping once it hits zero. When the instruction counter steps from one down to zero, the instruction count exception is raised.
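The countdown behavior can be sketched in Python. The sketch assumes the counter is decremented once per issued instruction; the class and method names are hypothetical.

```python
class InstructionCounter:
    """Per-stream instruction counter that counts down and stops at zero."""

    def __init__(self) -> None:
        self.value = 0

    def save(self) -> int:
        """STREAM_COUNT_INST: return the counter."""
        return self.value

    def restore(self, y: int) -> int:
        """STREAM_COUNT_INST_RESTORE: return the old counter, then set it to y."""
        old = self.value
        self.value = y
        return old

    def issue(self) -> bool:
        """Count one instruction; True if the count exception is raised."""
        if self.value == 0:
            return False           # a counter at zero stays at zero
        self.value -= 1
        return self.value == 0     # the step from one to zero raises the exception
```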

There is also an instruction issue counter for each protection domain for accounting and performance measurement.

```
RAISES
(nothing)
SEE ALSO
COUNTISSUES, §10.1
```

```
(STREAM_CREATE_IMM r t u x y offset)                                    MAC
    if SCUR_D = SRES_D then
        raise create exception
    else
        SCUR_D ← SCUR_D + 1;
        ssw.pc' ← ssw.pc + 'offset;
        ssw.md' ← ssw.md;
        ssw.tm' ← ssw.tm;
        ssw.cv' ← 0;
        D' ← D;
        LEVEL' ← LEVEL;
        r' ← r;
        t' ← u;
        x' ← y;
        T0' ← T0;
        EXCEPTION' is cleared;
        RESULTCODE' is cleared;
        instruction counter' ← 0;
    {where 'offset ∈ [-2^14 ... 2^14 - 1], 'lo = 'offset mod 1024, 'hi = ⌊'offset/1024⌋}
```

This operation attempts to create a new instruction stream. The unprimed registers represent registers in the old stream. The primed registers represent registers in the new stream.

If the current number of streams executing in the protection domain  $(SCUR_D)$  is less than the number reserved by prior STREAM\_RESERVE operations  $(SRES_D)$ , then the STREAM\_CREATE succeeds, and state is copied from the current stream into the new stream.

By convention, target register T0 contains the ssw for the current trap handler. These registers are duplicated in the new stream. The three general-purpose registers r, u, and y that are copied into the new stream will typically be the frame pointer, an argument pointer, and some stream identifier. The general purpose registers other than r, t, and x, the trap registers, and target registers (other than T0) are undefined. The exception register and result code register are cleared (a safe state). The instruction count register is set to zero, which will not cause a trap on issues in the stream.

Streams must first be reserved with a STREAM\_RESERVE operation, can then be created with a STREAM\_CREATE operation, and are then killed off with a STREAM\_QUIT operation.
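The bookkeeping behind this lifecycle can be sketched in Python, using the counter updates given in the STREAM\_CREATE and STREAM\_QUIT definitions. Register copying and privilege checks are not modeled; the class and function names are hypothetical.

```python
class CreateException(Exception):
    """Raised when no reserved stream is left to create."""


class Domain:
    def __init__(self, scur: int, sres: int) -> None:
        self.scur = scur   # streams currently executing (SCUR_D)
        self.sres = sres   # streams currently reserved (SRES_D)


def stream_create(d: Domain) -> None:
    if d.scur == d.sres:
        raise CreateException("all reserved streams are in use")
    d.scur += 1


def stream_quit(d: Domain) -> None:
    d.scur -= 1
    d.sres -= 1            # the reservation is given up


def stream_quit_preserve(d: Domain) -> None:
    d.scur -= 1            # the reservation is kept for a later create
```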

```
RAISES
create
COUNTS AS
CNT_CREATE
SEE ALSO
STREAM_RESERVE, STREAM_QUIT, §12.1
STREAM_CREATE_
```

```
(STREAM_QUIT)
    SCUR_D ← SCUR_D - 1;
    SRES_D ← SRES_D - 1;
    release trap registers;
    release OPA/OPD slots;
    stop execution;

(STREAM_QUIT_PRESERVE)
    SCUR_D ← SCUR_D - 1;
    release trap registers;
    release OPA/OPD slots;
    stop execution;
```

The QUIT operation is supervisor-privileged if field "priv\_quit" is set in the program state descriptor. Privileged quit mode lets the operating system clear the stream's state before returning it to the hardware for reallocation (possibly to another protection domain).

Lookahead beyond a STREAM\_QUIT is forbidden. Streams which allow such lookahead will force the STREAM\_QUIT to be retried, wasting issue slots.

The QUIT\_PRESERVE operation preserves the reservation for the current stream.

```
RAISES

privileged

COUNTS AS

CNT_QUIT

SEE ALSO

STREAM_CREATE, STREAM_RESERVE
```

```
(STREAM_RESERVE t u st)                              ... t u 0 st 08 ...
    req ← if 'st = 0 then u else 'st end;
    t ← if (req ≤ S - Σ_i SRES_i - limbo)
         and (req ≤ SLIM_D - SRES_D)
        then req else 0 end;
    SRES_D ← SRES_D + t
    {where 'st ∈ [0...255]}
(STREAM_RESERVE_TEST t u st)                         ... t u 0 st 09 ...    A
    req ← if 'st = 0 then u else 'st end;
    t ← if (req ≤ S - Σ_i SRES_i - limbo)
         and (req ≤ SLIM_D - SRES_D)
        then req else 0 end;
    SRES_D ← SRES_D + t
    {where 'st ∈ [0...255]}
(STREAM_RESERVE_UPTO t u st)                         ... t u 1 st 08 ...
    req ← if 'st = 0 then u else 'st end;
    t ← min(req, S - Σ_i SRES_i - limbo, max(SLIM_D - SRES_D, 0));
    SRES_D ← SRES_D + t
    {where 'st ∈ [0...255]}
(STREAM_RESERVE_UPTO_TEST t u st)                    ... t u 1 st 09 ...    A
    req ← if 'st = 0 then u else 'st end;
    t ← min(req, S - Σ_i SRES_i - limbo, max(SLIM_D - SRES_D, 0));
    SRES_D ← SRES_D + t
    {where 'st ∈ [0...255]}
```

These operations are used to reserve streams prior to creating streams with the STREAM\_CREATE operation.

The number of streams reserved for the protection domain to which this stream belongs,  $SRES_D$ , is incremented by req if possible. The result register t reflects the amount by which  $SRES_D$  was actually changed. The register u is an unsigned integer.

The resulting stream reservation will not exceed the larger of the current reservation and  $SLIM_D$ , the maximum number of streams allocated to this protection domain. In addition, the sum of all reservations cannot exceed S, the number of streams available in the processor. The operating system may encourage (inhibit) parallelism by setting  $SLIM_D$  above (below)  $SRES_D$ .

The \_TEST versions of these operations never generate overflow/NaN, and generate carry if the reservation  $SRES_D$  was changed by exactly req.
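The two reservation policies can be sketched in Python directly from the definitions. The list `reservations` stands for the SRES values of all protection domains, including the requesting one; all names are hypothetical.

```python
def request_size(st: int, u: int) -> int:
    """req: the immediate st if it is nonzero, otherwise register u."""
    return st if st != 0 else u


def stream_reserve(req, s_total, reservations, limbo, slim_d, sres_d):
    """All-or-nothing reservation: returns (t, carry)."""
    free = s_total - sum(reservations) - limbo
    t = req if (req <= free and req <= slim_d - sres_d) else 0
    return t, t == req     # carry: SRES_D changed by exactly req


def stream_reserve_upto(req, s_total, reservations, limbo, slim_d, sres_d):
    """Best-effort reservation: returns (t, carry)."""
    free = s_total - sum(reservations) - limbo
    t = min(req, free, max(slim_d - sres_d, 0))
    return t, t == req
```

In either case the caller would then add t to SRES\_D, as the definitions do.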

```
RAISES
```

(nothing)

SEE ALSO

STREAM\_CREATE, STREAM\_QUIT

STREAM\_RESERVE\_

(TARGET\_DISP tn offset)
$$tn \leftarrow ssw.pc + 'offset$$

$$\{where 'offset \in [-2^{14} \dots 2^{14} - 1]\}$$
(TARGET\_INDEX tn u)
$$tn \leftarrow ssw.pc + u$$
(TARGET\_RESTORE tn u)
$$tn \leftarrow u$$
(TARGET\_SAVE x tn)
$$x \leftarrow StreamStatusWord(ssw.cv, ssw.tm, ssw.md, tn)$$

These operations establish target values for the program counter to be used by a subsequent branch operation, i.e. one of the operations from the JUMP\_ family, or LEVEL\_RTN.

There are eight target registers addressed by tn. The TARGET\_DISP and TARGET\_INDEX operations set the addressed target, using the values currently in the ssw.pc. TARGET\_SAVE saves the StreamStatusWord with the pc replaced by the addressed target in register x.

Conversely, TARGET\_RESTORE restores the addressed target from register u.
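The four target operations can be sketched in Python. The sketch models each target register as a bare program-counter value and the stream status word as a named tuple of its four fields; both are simplifying assumptions, and all names are hypothetical.

```python
from typing import NamedTuple


class SSW(NamedTuple):
    cv: int
    tm: int
    md: int
    pc: int


class Stream:
    def __init__(self, ssw: SSW) -> None:
        self.ssw = ssw
        self.targets = [0] * 8          # eight target registers

    def target_disp(self, tn: int, offset: int) -> None:
        if not -2**14 <= offset <= 2**14 - 1:
            raise ValueError("offset out of range")
        self.targets[tn] = self.ssw.pc + offset

    def target_index(self, tn: int, u: int) -> None:
        self.targets[tn] = self.ssw.pc + u

    def target_restore(self, tn: int, u: int) -> None:
        self.targets[tn] = u

    def target_save(self, tn: int) -> SSW:
        # The SSW with its pc replaced by the addressed target.
        return self.ssw._replace(pc=self.targets[tn])
```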

If field "priv\_t0" is set in the program state descriptor, setting target zero is supervisor-privileged; this allows trap handlers to be trustworthy. See §8.5.

When a target register is loaded, the corresponding program instructions are prefetched; see §7.2. Separating the TARGET from the JUMP allows the instruction fetch latency to be hidden.

RAISES

privileged

COUNTS AS

CNT\_TARGET if not TARGET\_SAVE

SEE ALSO

JUMP\_, LEVEL\_RTN, SSW\_DISP

(TRAP\_RESTORE  $tr \ y$ )

(trap register at  $'tr$ ) $\leftarrow$  $y$ 
{where  $'tr \in [0 \dots 7]$ }

(TRAP\_SAVE  $x \ tr$ )

 $x \leftarrow (\text{trap register at } 'tr)$ 
{where  $'tr \in [0 \dots 7]$ }

These operations save and restore values to the trap registers. By convention, these operations are used by the trap handler. The TRAP\_RESTORE operation does not check for poison on y. This exception allows the trap handler to free a register without raising a spurious poison exception.

Trap registers are allocated dynamically; see §9.2.

**RAISES** 

(nothing)

TRAP\_

| Operation | Definition | Opcode fields (hex, where legible) | Op type |
|---|---|---|---|
| (UNS\_CEIL t u) | $t \leftarrow$ unsigned integer ceiling of float $u$ | 18 08 08 | A |
| (UNS\_CEIL\_TEST t u) | $t \leftarrow$ unsigned integer ceiling of float $u$ | 18 08 09 | A |
| (UNS\_CHOP t u) | $t \leftarrow$ unsigned integer chop of float $u$ | 18 09 08 | A |
| (UNS\_CHOP\_TEST t u) | $t \leftarrow$ unsigned integer chop of float $u$ | 18 09 09 | A |
| (UNS\_FLOOR t u) | $t \leftarrow$ unsigned integer floor of float $u$ | 18 04 08 | A |
| (UNS\_FLOOR\_TEST t u) | $t \leftarrow$ unsigned integer floor of float $u$ | | A |
| (UNS\_NEAR t u) | $t \leftarrow$ unsigned integer nearest float $u$ | | A |
| (UNS\_NEAR\_TEST t u) | $t \leftarrow$ unsigned integer nearest float $u$ | | A |
| (UNS\_ROUND t u) | $t \leftarrow$ unsigned integer round of float $u$ | 18 0E 08 | A |
| (UNS\_ROUND\_TEST t u) | $t \leftarrow$ unsigned integer round of float $u$ | 18 0E 09 | A |

These operations convert floating-point numbers into unsigned integers. The roundings are directed as in IEEE Standard 754. UNS\_ROUND uses the rounding mode in the ssw.

A float invalid exception is raised when the result is negative or too large to represent. In these cases the result is reduced modulo  $2^{64}$ .

The \_TEST versions of these operations never generate carry or overflow/NaN.
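The conversion rule described above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the name `uns_convert` and the returned flag tuple are invented for illustration, ties in the "near" mode are assumed to round to even, the ssw-controlled mode of UNS\_ROUND is not modelled, and NaN or infinite inputs are rejected because the text does not describe them here.

```python
import math

MASK64 = (1 << 64) - 1

# Directed roundings, keyed by the operation suffix used in the text.
# Assumption: "near" breaks ties to even, which is what Python's round() does.
_ROUNDERS = {
    "ceil": math.ceil,
    "chop": math.trunc,
    "floor": math.floor,
    "near": round,
}

def uns_convert(u, mode):
    """Hypothetical model: return (t, invalid, inexact) for a finite float u."""
    if not math.isfinite(u):
        raise ValueError("NaN and infinity are outside this sketch")
    exact = _ROUNDERS[mode](u)                # unbounded integer result
    invalid = exact < 0 or exact > MASK64     # negative or too large
    inexact = exact != u                      # rounding changed the value
    return exact & MASK64, invalid, inexact   # result reduced modulo 2**64
```

For instance, `uns_convert(-1.0, "floor")` reports the invalid condition and returns the all-ones word, which is -1 reduced modulo $2^{64}$.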

RAISES

float\_invalid, float\_inexact

SEE ALSO

FLOAT\_INT, FLOAT\_UNS, INT\_CEIL, INT\_CHOP, INT\_FLOOR, INT\_NEAR, INT\_ROUND, FLOAT\_CEIL, FLOAT\_CHOP, FLOAT\_FLOOR, FLOAT\_NEAR, FLOAT\_ROUND

(UNS\_ADD\_CARRY\_TEST x y z)
$$x \leftarrow y + z + cv_1.carry, \text{ integer}$$
(UNS\_SUB\_CARRY\_TEST x y z)
$$x \leftarrow y + \neg z + cv_1.carry, \text{ integer}$$

These operations are intended to be used in multi-word integer add, subtract, and multiply. Note that carry-in is taken from  $cv_1$ , to simplify use of these operations in loops.
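A multi-word add or subtract built from these two operations can be sketched in Python as follows. The helper names are hypothetical, and the sketch assumes that the \_TEST form leaves the carry out of bit 63 where the next step reads it as its carry-in; words are listed least significant first.

```python
MASK64 = (1 << 64) - 1

def add_carry(y, z, carry_in):
    """One 64-bit step of x = y + z + carry; returns (x, carry_out)."""
    total = y + z + carry_in
    return total & MASK64, total >> 64

def sub_carry(y, z, carry_in):
    """One 64-bit step of x = y + (not z) + carry; returns (x, carry_out)."""
    return add_carry(y, ~z & MASK64, carry_in)

def multiword(step, ys, zs, carry):
    """Apply step word by word, threading the carry (assumed convention)."""
    out = []
    for y, z in zip(ys, zs):
        x, carry = step(y, z, carry)
        out.append(x)
    return out, carry
```

A subtraction starts with a carry-in of 1, so that `y + ~z + 1` equals `y - z` in two's complement: `multiword(sub_carry, [0, 1], [1, 0], 1)` computes $2^{64} - 1$.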

RAISES
(nothing)
SEE ALSO
INT\_ADD, INT\_SUB


(UNS\_ADD\_MUL\_UPPER t u v w)
$$t \leftarrow (u + v * w)/2^{64}, \text{ unsigned}$$
(UNS\_ADD\_MUL\_UPPER\_TEST t u v w)
$$t \leftarrow (u + v * w)/2^{64}, \text{ unsigned}$$

The UNS\_ADD\_MUL\_UPPER operation is used to implement multiple-precision integer multiplication.

The operation produces the high 64 bits of u + v \* w. When u, v, and w contain 52-bit unsigned integers, the only possible effect of u on this result is via a carry from its position in the low 52 bits. The low bits of the unsigned add-multiply may be obtained from an INT\_ADD\_MUL operation, even though its operands are ordinarily interpreted as signed two's-complement values.

There is no UPPER\_SUB\_MUL because multi-word subtract-multiply does not use it.

The \_TEST version of this operation never generates overflow/NaN or carry.

If v or w is outside  $[-2^{53} \dots 2^{53} - 1]$ , the float\_extension exception is raised.
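The relationship between the upper and lower halves can be illustrated with Python's unbounded integers. This is a sketch, not a model of the hardware: the function names are hypothetical, the float\_extension check on v and w is not modelled, and `add_mul_lower` merely stands in for the low 64 bits that the text says INT\_ADD\_MUL supplies.

```python
MASK64 = (1 << 64) - 1

def uns_add_mul_upper(u, v, w):
    """High 64 bits of the unsigned value u + v*w."""
    return ((u + v * w) >> 64) & MASK64

def add_mul_lower(u, v, w):
    """Low 64 bits of u + v*w (assumed role of INT_ADD_MUL)."""
    return (u + v * w) & MASK64

def add_mul_pair(u, v, w):
    """Both halves together, as one step of a multiple-precision multiply."""
    return uns_add_mul_upper(u, v, w), add_mul_lower(u, v, w)
```

Recombining the pair as `(upper << 64) + lower` gives back `u + v*w` exactly whenever v and w are within the operand range stated above.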

RAISES

float\_extension

SEE ALSO

INT\_ADD\_MUL, INT\_SUB\_MUL, INT\_SUB\_MUL\_REV

(UNS\_DIV t u v w)

$exp \leftarrow$ unbiased exponent of $w$

$temp \leftarrow v * w/2^{exp}$, round to floor

$t \leftarrow temp * 2^{exp}$, round to floor

(UNS\_DIV\_TEST t u v w)

$exp \leftarrow$ unbiased exponent of $w$

$temp \leftarrow v * w/2^{exp}$, round to floor

$t \leftarrow temp * 2^{exp}$, round to floor

These operations are used to implement integer division. The product from unsigned v and SpecialFloat64 w is shifted right according to exp and rounded, producing an unsigned integer.

The \_TEST version of this operation generates carry when the quotient is not exact, i.e. when the division by  $2^{-exp}$  yields a non-zero remainder. This operation never generates overflow.

If v is outside  $[0...2^{53} - 1]$ , the float\_extension exception is raised and the result in t may be incorrect.

Although register u is not used in the current hardware implementation, the software requires u to contain the denominator in order to properly handle float\_extension exceptions.

RAISES
float\_extension
SEE ALSO
INT\_DIV\_CHOP, INT\_DIV\_FLOOR, §12.6


(UNS\_LOADB r s)
$$r \leftarrow \text{zero extend(byte at } s\text{), with FE\_NORMAL}$$
(UNS\_LOADB\_AC\_DISP r s ac disp)
$$r \leftarrow \text{zero extend(byte at } s + {}^{i}disp \bmod 2^{48}\text{), with } {}^{i}ac$$

$$\{\text{where } {}^{i}disp \in [0 \dots 16383]\}$$
(UNS\_LOADB\_AC\_INDEX r s ac y)
$$r \leftarrow \text{zero extend(byte at } s + y \bmod 2^{48}\text{), with } {}^{i}ac$$
(UNS\_LOADB\_DISP r s disp)
$$r \leftarrow \text{zero extend(byte at } s + {}^{i}disp \bmod 2^{48}\text{), with FE\_NORMAL}$$

$$\{\text{where } {}^{i}disp \in [0 \dots 524287]\}$$
(UNS\_LOADB\_INDEX r s y)
$$r \leftarrow \text{zero extend(byte at } s + y \bmod 2^{48}\text{), with FE\_NORMAL}$$

These operations load an unsigned byte from memory. The fe\_control is taken from the ac field if present, or forced to FE\_NORMAL. If ac is present, its forward, data trap0, and data trap1 disable bits are used; otherwise those of s are used.

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_LOAD

SEE ALSO

STOREB, INT\_LOADB

These operations load an unsigned halfword from memory. The fe\_control is taken from the ac field if present, or forced to FE\_NORMAL. If ac is present, its forward, data trap0, and data trap1 disable bits are used; otherwise those of s are used.

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_LOAD

SEE ALSO

STOREH, INT\_LOADH

UNS\_LOADH\_

(UNS\_LOADQ r s)
$$r \leftarrow \text{zero extend(quarterword at } s\text{), with FE\_NORMAL}$$
(UNS\_LOADQ\_AC\_DISP r s ac disp)
$$r \leftarrow \text{zero extend(quarterword at } s + {}^{i}disp \bmod 2^{48}\text{), with } {}^{i}ac$$
(UNS\_LOADQ\_AC\_INDEX r s ac y)
$$r \leftarrow \text{zero extend(quarterword at } s + y \bmod 2^{48}\text{), with } {}^{i}ac$$
(UNS\_LOADQ\_DISP r s disp)
$$r \leftarrow \text{zero extend(quarterword at } s + {}^{i}disp \bmod 2^{48}\text{), with FE\_NORMAL}$$
(UNS\_LOADQ\_INDEX r s y)
$$r \leftarrow \text{zero extend(quarterword at } s + y \bmod 2^{48}\text{), with FE\_NORMAL}$$

These operations load an unsigned quarterword from memory. When the destination register r is r0 with UNS\_LOADQ, no operation is performed. The fe\_control is taken from the ac field if present, or forced to FE\_NORMAL. If ac is present, its forward, data trap0, and data trap1 disable bits are used; otherwise those of s are used.

RAISES

data\_hw\_error, data\_prot, data\_alignment, data\_blocked

COUNTS AS

CNT\_LOAD

SEE ALSO

STOREQ, INT\_LOADQ

(UNS\_RECIP\_SHIFT x y)
$$x \leftarrow \log_2 y, \text{ round to ceiling}$$
(UNS\_RECIP\_SHIFT\_TEST x y)
$$x \leftarrow \log_2 y, \text{ round to ceiling}$$

These operations are used to compute integer reciprocals. They compute the ceiling of the log base 2 of y. When y is zero, x is set to -1.
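The value produced can be stated compactly in Python. This is an illustrative sketch for non-negative integer inputs; the function name is hypothetical, and the condition codes set by the \_TEST form are not modelled.

```python
def uns_recip_shift(y):
    """Ceiling of log2(y) for an unsigned integer y; -1 when y is zero."""
    if y == 0:
        return -1
    # For y >= 1, the bit length of y - 1 equals ceil(log2(y)).
    return (y - 1).bit_length()
```

Exact powers of two map to their exponent (`uns_recip_shift(4)` is 2), and every other value rounds up to the next exponent (`uns_recip_shift(5)` is 3).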

RAISES
(nothing)
SEE ALSO
UNS\_DIV, §12.6


(UNS\_SHIFT\_RIGHT x y z)
$$x \leftarrow y \gg z$$
(UNS\_SHIFT\_RIGHT\_TEST x y z)
$$x \leftarrow y \gg z$$

These operations shift to the right, filling in 0's on the left. Unsigned shift counts in z are taken modulo 64.

The \_TEST version generates carry if a 1-bit is shifted out of w or y, and never generates overflow/NaN.
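A small Python model of the shift and its carry rule follows. It is a sketch only: the function name and the `(x, carry)` return shape are assumptions made for illustration, and only the single source operand y is modelled.

```python
MASK64 = (1 << 64) - 1

def uns_shift_right_test(y, z):
    """Hypothetical model: logical right shift of the 64-bit value y."""
    count = z % 64                          # shift counts are taken modulo 64
    y &= MASK64
    x = y >> count                          # zeros fill in on the left
    shifted_out = y & ((1 << count) - 1)    # bits that fell off the right end
    return x, shifted_out != 0              # carry if any of them was a 1
```

Shifting `0b1011` right by 2 returns `0b10` with carry set, because the two discarded bits include a 1; a count of 64 behaves like a count of 0.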

RAISES

(nothing)

SEE ALSO

INT\_SHIFT\_RIGHT, SHIFT\_LEFT, SHIFT\_PAIR\_RIGHT

# Chapter 12: Programming Examples

# 12.1 Stream Creation

[[This needs examples.]]

# 12.2 Forwarding Pointers

[[This needs examples.]]

# 12.3 Vector Loops

Consider this Fortran loop to compute the inner product:

```
<fortran inner product>=
    do 100 i = 1, n
        sum = sum + a(i)*b(i)
100    continue
```

It could be compiled to the TERA program fragment shown below (omitting most of the preamble, and with the loop unrolled):

```
<tera inner product>≡
       (allocate-from-reg Reg 1)
       (assign-reg Reg
         ntrips sum sumoffset
         stackpointer
         ai apointer astep
         bi bpointer bstep)
       (equ ZERO r0)
       ; preamble: start the first load, set target t1, clear sum
       (inst 1 (INT_LOAD ai apointer)
              (TARGET_DISP t1 loop100)
              (REG_MOVE sum ZERO))
(label loop100)
       ; loop body, unrolled twice
       (inst 0 (INT_LOAD bi bpointer)
              (INT_ADD apointer apointer astep)
              (INT_ADD bpointer bpointer bstep))
       (inst 1 (INT_LOAD ai apointer)
              (FLOAT_ADD_MUL sum sum ai bi)
              (INT_ADD apointer apointer astep))
       (inst 0 (INT_LOAD bi bpointer)
              (INT_SUB_IMM_TEST ntrips ntrips 2)
              (INT_ADD bpointer bpointer bstep))
       (inst 1 (INT_LOAD ai apointer)
              (FLOAT_ADD_MUL sum sum ai bi)
              (JUMP if_ile c0 t1))
       ; store the accumulated sum
       (inst 0 (STORE_INDEX sum stackpointer sumoffset))
```

The TARGET in the first instruction sets the target's program counter. The lookahead values let pairs of instructions in the loop run in parallel. The loop achieves one flop per instruction.

Another Fortran loop of interest is the following:

Loop transformations (strip mining and interchange) yield

```
<fortran inner product>≡
        do 201 ii = 1,m - 19,20
                lasti = ii + 19
                do 101 j = 1,n
                        do 1 i = ii,lasti
                                c(i) = c(i) + a(i,j)*b(j)
1                       continue
101             continue
201     continue
        do 202 i = lasti + 1,m
                do 102 j = 1,n
                        c(i) = c(i) + a(i,j)*b(j)
102             continue
202     continue
```
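The effect of strip mining and interchange can be checked with a short Python sketch. It assumes that the untransformed loop is the plain update c(i) = c(i) + a(i,j)\*b(j) over all i and j; the function names, the zero-based indexing, and the `strip` parameter are illustrative choices rather than anything taken from the text.

```python
def matvec_reference(a, b, c):
    """Plain doubly nested loop (assumed form of the untransformed code)."""
    out = list(c)
    for i in range(len(out)):
        for j in range(len(b)):
            out[i] += a[i][j] * b[j]
    return out

def matvec_strip_mined(a, b, c, strip=20):
    """Strip-mined over i, with the j loop moved outside each strip."""
    out = list(c)
    m, n = len(out), len(b)
    full = m - m % strip                 # rows covered by complete strips
    for ii in range(0, full, strip):
        for j in range(n):
            for i in range(ii, ii + strip):
                out[i] += a[i][j] * b[j]
    for i in range(full, m):             # leftover rows, fewer than one strip
        for j in range(n):
            out[i] += a[i][j] * b[j]
    return out
```

Each c(i) still accumulates its terms in increasing j order, so the two versions agree; the transformation only changes which values can be held in registers across the innermost loop.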

The do 1 loop may now be unrolled:

```
<unrolled fortran inner product>≡
        do 201 ii = 1,m - 19,20
                lasti = ii + 19
                do 101 j = 1,n
                        c(ii) = c(ii) + a(ii,j)*b(j)
                        c(ii + 1) = c(ii + 1) + a(ii + 1,j)*b(j)
                        c(ii + 2) = c(ii + 2) + a(ii + 2,j)*b(j)
c                       ...
                        c(ii + 19) = c(ii + 19) + a(ii + 19,j)*b(j)
101             continue
201     continue
        do 202 i = lasti + 1,m
                do 102 j = 1,n
                        c(i) = c(i) + a(i,j)*b(j)
102             continue
202     continue
```

The do 101 loop now corresponds to the TERA assembly language fragment below (again, omitting the preamble):

```
<tera unrolled inner product>≡
       (allocate-from-reg Reg 1)
       (assign-reg Reg
         bj bij bpointer bstep
         aij apointer astep abigstep
         c_0 c_1 c_17 c_18 c_19
         )
       ; ...
       (inst 0 (TARGET_DISP t1 loop101))
(label loop101)
       (inst 0 (INT_LOAD bj bpointer)
              (INT_ADD apointer apointer abigstep)
              (INT_ADD bpointer bpointer bstep))
       (inst 0 (INT_LOAD aij apointer)
              (INT_ADD apointer apointer astep))
       (inst 0 (INT_LOAD aij apointer)
              (FLOAT_ADD_MUL c_0 c_0 aij bj)
              (INT_ADD apointer apointer astep))
       (inst 0 (INT_LOAD aij apointer)
              (FLOAT_ADD_MUL c_1 c_1 aij bj)
              (INT_ADD apointer apointer astep))
       ; ...
       (inst 0 (INT_LOAD aij apointer)
              (FLOAT_ADD_MUL c_17 c_17 aij bj)
              (INT_ADD apointer apointer astep))
       (inst 0 (INT_LOAD aij apointer)
              (FLOAT_ADD_MUL c_18 c_18 aij bj)
              (INT_SUB_IMM_TEST ntrips ntrips 1))
       (inst 7 (FLOAT_ADD_MUL c_19 c_19 aij bj)
              (JUMP if_ile c0 t1))
```

The lookahead values of 0 throughout most of the loop could be increased by using distinct aij and apointer registers for each value of i, but this consumes more registers and thereby decreases the unrolling factor.

Minimal lookahead and an unrolling factor of 20 yield 40 flops every 22 instructions, or about 1.8 flop/instruction. (This figure could be improved to 80 flops every 42 instructions, or about 1.9 flop/instruction, by unrolling two iterations of the j loop.) On the other hand, a sustained lookahead of 7, the maximum possible, can be had by unrolling 8 iterations of the i loop and 2 of the j loop. This approach still yields a respectable 32 flops every 18 instructions, or about 1.8 flop/instruction, with one-eighth as many streams needed. Once again, the somewhat lengthy preamble is omitted; in this case, it involves loading not just 8 ci's but 8 aij's as well. The loop uses 23 registers, so it is quite reasonable to expect a compiler to unroll as fully as this:

```
<tera fully unrolled inner product>≡
       (allocate-from-reg Reg 1)
       (assign-reg Reg
         ntrips
         a0j a1j a7j apointer ajstep
         bj0 bj1 bpointer bstep c0 c1 c7)
       (define AISTEP (double 'SIZE))
       (inst 0 (TARGET_DISP t1 loop))
(label loop)
       (inst 7 (INT_LOAD a0j apointer)
              (FLOAT_ADD_MUL c0 c0 a0j bj0)
              (INT_ADD bpointer bpointer bstep))
       (inst 7 (INT_LOAD_DISP a1j apointer AISTEP)
              (FLOAT_ADD_MUL c1 c1 a1j bj0))
       ; ...
       (inst 7 (INT_LOAD_DISP a7j apointer (* 7 AISTEP))
              (FLOAT_ADD_MUL c7 c7 a7j bj0))
       (inst 7 (INT_LOAD bj0 bpointer)
              (INT_ADD apointer apointer ajstep)
              (INT_ADD bpointer bpointer bstep))
       (inst 7 (INT_LOAD a0j apointer)
              (FLOAT_ADD_MUL c0 c0 a0j bj1)
              (INT_SUB_IMM_TEST ntrips ntrips 2))
       (inst 7 (INT_LOAD_DISP a1j apointer AISTEP)
              (FLOAT_ADD_MUL c1 c1 a1j bj1))
       ; ...
       (inst 7 (INT_LOAD_DISP a7j apointer (* 7 AISTEP))
              (FLOAT_ADD_MUL c7 c7 a7j bj1))
       (inst 7 (INT_LOAD bj1 bpointer)
              (INT_ADD apointer apointer ajstep)
              (JUMP if_ile c0 t1))
       ;...
```
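The throughput of the loop schedules in this section follows from counting FLOAT\_ADD\_MUL operations per loop body. The Python sketch below does that arithmetic; the scheme labels and the helper name are illustrative, and each FLOAT\_ADD\_MUL is counted as two flops (one multiply, one add).

```python
from fractions import Fraction

FLOPS_PER_FMA = 2  # assumption: one multiply plus one add per FLOAT_ADD_MUL

def flops_per_instruction(fmas, instructions):
    """Exact ratio of floating-point operations to instructions issued."""
    return Fraction(fmas * FLOPS_PER_FMA, instructions)

# (FLOAT_ADD_MUL count, instruction count) per loop body
schemes = {
    "inner product, unrolled by 2": flops_per_instruction(2, 4),
    "i unrolled by 20": flops_per_instruction(20, 22),
    "i unrolled by 20, j by 2": flops_per_instruction(40, 42),
    "i unrolled by 8, j by 2": flops_per_instruction(16, 18),
}

for name, rate in schemes.items():
    print(f"{name}: {float(rate):.2f} flop/instruction")
```

The ratios are 1.00, 1.82, 1.90, and 1.78 flop/instruction respectively, so the lookahead-7 schedule gives up only a few percent of per-instruction throughput relative to the 20-way unrolling.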

# 12.4 Doubled Precision Floating-point Arithmetic

The TERA architecture provides support for 128-bit "doubled precision" floating-point arithmetic. A doubled precision representation is an ordered pair [X, x] of floating-point numbers in which x is insignificant compared to X, that is, round(X + x) = X. The only rounding mode supported in doubled precision arithmetic is "round to nearest".

We say [X,x] is the doubled precision representation of the real number  $\xi$  if  $X = \text{round}(\xi)$  and  $x = \text{round}(\xi - \text{round}(\xi))$ . It follows from the definition that round(X + x) = X. Testing the more significant part of a doubled precision value is sufficient because of this "normalization" property. Another important property is this: if A and B are floating-point numbers, then there is a unique, exact doubled precision representation for A + B.

The representation [X, x] of A + B is computed as follows:

```
<doubled precision math>≡
(define (doubled_add_single X x A B temp1)
  (inst 0 (FLOAT_MMAX temp1 A B) (FLOAT_ADD X A B))
  (inst 0 (FLOAT_MMIN temp1 A B) (FLOAT_SUB x temp1 X))
  (inst 0 (FLOAT_ADD x x temp1))
)
```
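The three-instruction sequence above can be mirrored in Python, whose floats are IEEE double precision with round to nearest. This is a sketch under one labelled assumption: FLOAT\_MMAX and FLOAT\_MMIN are taken to select the operand of larger and smaller magnitude respectively. Exact rational arithmetic is used only to check the result.

```python
from fractions import Fraction

def doubled_add_single(A, B):
    """Return (X, x) with X = round(A + B) and X + x equal to A + B exactly."""
    # Assumption: MMAX/MMIN pick the larger/smaller operand by magnitude.
    big, small = (A, B) if abs(A) >= abs(B) else (B, A)
    X = A + B          # FLOAT_ADD X A B
    x = big - X        # FLOAT_SUB x temp1 X, with temp1 = larger magnitude
    x = x + small      # FLOAT_ADD x x temp1, with temp1 = smaller magnitude
    return X, x

def is_exact(A, B):
    """Check the pair against exact rational arithmetic."""
    X, x = doubled_add_single(A, B)
    return Fraction(X) + Fraction(x) == Fraction(A) + Fraction(B)
```

For A = 1.0 and B = 1e-17 the sum rounds to X = 1.0, and x recovers the 1e-17 that the rounding discarded.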

A doubled precision floating-point add, [Z, z] = [X, x] + [Y, y], is computed like this, where temp1, temp2, temp3, and temp4 are temporary registers that are distinct from those holding Z or z:

```
<doubled precision math>≡
(define (doubled_add Z z X x Y y temp1 temp2 temp3 temp4)
 (let* ((A temp1) (a temp2) (B temp3) (b temp4)
        (C B) (c a) (t z) (u Z)
        )
      (inst 0 (FLOAT_MMAX t X Y) (FLOAT_ADD A X Y))
      (inst 0 (FLOAT_MMIN t X Y) (FLOAT_SUB a t A))
      (inst 0 (FLOAT_ADD a a t) (FLOAT_ADD B x y))
      (inst 0 (FLOAT_MMAX t x y) (FLOAT_ADD a a B))
      (inst 0 (FLOAT_SUB b t B) (FLOAT_ADD C A a))
      (inst 0 (FLOAT_MMIN u x y) (FLOAT_SUB c A C))
      (inst 0 (FLOAT_ADD c c a) (FLOAT_ADD b b u))
      (inst 0 (FLOAT_ADD c c b))
      (inst 0 (FLOAT_ADD Z C c))
      (inst 0 (FLOAT_SUB z C Z))
      (inst 0 (FLOAT_ADD z z c))
      )
 )
```

Doubled precision floating-point subtract is similar, with the y and Y operands explicitly negated.

```
<doubled precision math>≡
(define (doubled_sub Z z X x Y y temp1 temp2 temp3 temp4)
 (let* ((A temp1) (a temp2) (B temp3) (b temp4)
        (C B) (c a) (t z) (u Z)
        )
      (inst 0 (BIT_MASK a 63 63) (FLOAT_SUB A X Y))
      (inst 0 (BIT_XOR a Y a) (BIT_XOR u y a))
      (inst 0 (FLOAT_MMAX t X a) (FLOAT_SUB B x y))
      (inst 0 (FLOAT_MMIN t X a) (FLOAT_SUB a t A))
      (inst 0 (FLOAT_MMAX t x u) (FLOAT_ADD a a t))
      (inst 0 (FLOAT_SUB b t B) (FLOAT_ADD a a B))
      (inst 0 (FLOAT_ADD C A a))
      (inst 0 (FLOAT_MMIN u x u) (FLOAT_SUB c A C))
      (inst 0 (FLOAT_ADD c c a) (FLOAT_ADD b b u))
      (inst 0 (FLOAT_ADD c c b))
      (inst 0 (FLOAT_ADD Z C c))
      (inst 0 (FLOAT_SUB z C Z))
      (inst 0 (FLOAT_ADD z z c))
      )
 )
```

The doubled precision product  $[Z,z] = [X,x] \times [Y,y]$  is computed as follows, where temp1 and temp2 must be temporary registers distinct from those holding Z, z, X, x, Y, or y. This code relies on the fact that the floating-point multiply-add operations only round once.

An important special case occurs when products of working precision numbers X and Y are to be computed in doubled precision [Z, z]:

```
<doubled precision math>≡
(define (doubled_single_mul Z z X Y)
  (let ((ZERO r0))
        (inst 0 (FLOAT_ADD_MUL Z ZERO X Y))
        (inst 0 (FLOAT_MUL_LOWER z Z X Y))
        )
)
```
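What this two-instruction sequence delivers can be illustrated in Python. The sketch assumes that FLOAT\_MUL\_LOWER z Z X Y yields the exact product X\*Y minus Z with a single rounding; here that is emulated with exact rational arithmetic rather than a fused operation, and the function name simply echoes the macro above.

```python
from fractions import Fraction

def doubled_single_mul(X, Y):
    """Return (Z, z): Z is the rounded product, z the rounding error."""
    exact = Fraction(X) * Fraction(Y)   # infinitely precise product
    Z = X * Y                           # FLOAT_ADD_MUL Z ZERO X Y
    # Assumed role of FLOAT_MUL_LOWER: X*Y - Z, rounded once.
    z = float(exact - Fraction(Z))
    return Z, z

def product_is_exact(X, Y):
    """True when Z + z reproduces X*Y exactly."""
    Z, z = doubled_single_mul(X, Y)
    return Fraction(Z) + Fraction(z) == Fraction(X) * Fraction(Y)
```

Barring underflow, the rounding error of a product of two doubles is itself a double, so the pair [Z, z] represents X\*Y exactly.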

# 12.5 Floating-point Division and Square Root

While directly implementing floating-point division was deemed infeasible, the TERA architecture provides support for correctly rounded IEEE division. Starting with a reciprocal approximation, the adder-multiplier is used to compute a correctly rounded quotient.

The floating-point divide q = x/y is computed like this, where q, x and y are held in registers and temp1 and temp2 are additional temporary registers, which must be distinct from the first three.

```
<floating divide>≡
(define (float_divide q x y temp1 temp2)
  (let ((r temp1) (e temp2))
      (float_reciprocal r y q)
      (inst 0 (FLOAT_DIV_APPROX q y x r))
      (inst 0 (FLOAT_DIV_ERROR e x y q))
      (inst 0 (FLOAT_DIV q q e r))
      )
 )
<floating divide>≡
(define (float_reciprocal r y temp1)
 (let ((e temp1))
      (inst 0 (FLOAT_RECIP_APPROX r y)) ; traps if y is denorm
      (inst 0 (FLOAT_RECIP_ERROR e y r))
      (inst 0 (FLOAT_ITER r r e r))
      (inst 0 (FLOAT_RECIP_ERROR e y r))
      (inst 0 (FLOAT_ITER r r e r))
      )
 )
(define (float_reciprocal_denorm r y)
 (let ((temp1 r))
      (inst 0 (INT_IMM temp1 1074))
      (inst 0 (FLOAT_SCALB temp1 y temp1))
      (inst 0 (INT_RECIP_APPROX r temp1))
      )
 )
```

The first FLOAT\_RECIP\_APPROX gives a reciprocal that is accurate to 14 bits. The FLOAT\_RECIP\_ERROR and FLOAT\_ITER pairs form a Newton iteration, which raises the accuracy to 28 and then 54 bits. The product of the reciprocal r and the numerator x is a quotient q correct to 1 ulp. The final instructions compute a remainder to correctly round q and deliver the final quotient. When the divisor y is denormalized, the trap handler should execute the float\_reciprocal\_denorm code sequence to compute the correct initial reciprocal approximation.
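The doubling of accuracy per Newton step can be seen in a short Python sketch that uses exact rational arithmetic, so that no rounding obscures the pattern. The semantics given to the two helpers are assumptions: FLOAT\_RECIP\_ERROR is taken to form e = 1 - y\*r, and FLOAT\_ITER r r e r to form r + e\*r. The seed is a hypothetical stand-in for FLOAT\_RECIP\_APPROX.

```python
from fractions import Fraction

def recip_error(y, r):
    """Residual e = 1 - y*r (assumed role of FLOAT_RECIP_ERROR)."""
    return 1 - y * r

def newton_step(r, e):
    """Refined reciprocal r + e*r (assumed role of FLOAT_ITER r r e r)."""
    return r + e * r

y = Fraction(3, 2)
# Hypothetical seed: 1/y truncated to 15 fraction bits.
r = Fraction(int(2**15 / y), 2**15)
errors = [recip_error(y, r)]
for _ in range(2):
    r = newton_step(r, errors[-1])
    errors.append(recip_error(y, r))
```

Each residual is exactly the square of the previous one, since 1 - y\*r(1 + e) = 1 - (1 - e)(1 + e) = e². In this run the residuals are $2^{-16}$, $2^{-32}$, and $2^{-64}$; the hardware rounds each step, which is why the text quotes 54 bits rather than the idealized figure.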

Computing the square root is performed similarly. The square root $q = \sqrt{y}$ is computed like this, where q and y are held in registers, and temp1 and temp2 are additional temporary registers, which must be distinct from the first two.

```
<floating square root>≡
(define (float_sqrt q y temp1 temp2)
 (let ((r temp1) (e temp2))
      (_float_sqrt q y r e)
      (inst 0 (FLOAT_SQRT_ERROR_TEST e y q q))
      (inst 0 (FLOAT_SQRT q q e r))
      )
 )
(define (_float_sqrt q y r e)
      (inst 0 (FLOAT_RSQRT_APPROX r y)) ; traps if y is denorm
      (inst 0 (FLOAT_SQRT_APPROX_TEST q y y r))
      (inst 0 (FLOAT_RSQRT_ERROR_TEST e y q r))
      (inst 0 (FLOAT_ITER r r e r))
      (inst 0 (FLOAT_SQRT_APPROX_TEST q y y r))
      (inst 0 (FLOAT_RSQRT_ERROR_TEST e y q r))
      (inst 0 (FLOAT_ITER r r e r))
      (inst 0 (FLOAT_SQRT_APPROX_TEST q y y r))
 )
(define (float_rsqrt_denorm r y)
 (let ((temp1 r))
      (inst 0 (INT_IMM temp1 1022))
      (inst 0 (FLOAT_SCALB temp1 y temp1))
      (inst 0 (INT_RSQRT_APPROX r temp1))
      )
 )
```

Here, the initial approximation to the reciprocal square root is correct to at least 14 bits. The FLOAT\_SQRT\_APPROX\_TEST uses the reciprocal root to compute an estimate of the square root. The two iterations improve the accuracy to the necessary 54 bits. The final operation computes the correct rounding for the delivered result. When the argument y is denormalized, the trap handler should execute the float\_rsqrt\_denorm code sequence to compute the correct initial reciprocal square root approximation.

# 12.6 Integer Division

The TERA architecture provides support for 64-bit integer division, including both signed and unsigned integer data types. The current implementation supports 53-bit integer division in hardware and longer operands in software. For signed division, we provide instructions to allow the quotient to be rounded toward zero by chopping, as in FORTRAN, or rounded toward negative infinity, so that  $q = \lfloor \frac{x}{y} \rfloor$ .

The FORTRAN integer divide q = x/y is computed like this, where q, x and y are held in registers, and temp1 is an additional temporary register, which must be distinct from the first three.

```
<integer divide>≡
(define (int_reciprocal r y temp1 temp2)
 (let ((fy temp2)
       (e temp1)
       )
      (inst 0 (FLOAT_INT fy y))
      (inst 0 (INT_RECIP_APPROX r fy))
      (inst 0 (INT_RECIP_ERROR e fy r))
      (inst 0 (FLOAT_ITER r r e r))
      (inst 0 (INT_RECIP_ERROR e fy r))
      (inst 0 (FLOAT_ITER r r e r))
      ; fix step: adjust r to the ceiling of the reciprocal
      (inst 0 (INT_DIV_CHOP e y y r)
           (INT_ADD_IMM r r 1))
      (inst 0 (INT_SUB r r e))
      )
 )
```

The first reciprocal approximation is correct to 14 bits. Two iterations raise that accuracy to 54 bits. The fix step is needed to guarantee that r is correctly rounded to the ceiling of the reciprocal. Finally, the divide instruction multiplies by the reciprocal and shifts right to compute the correct quotient.

Signed integer division with floored rounding is analogous, except the final INT\_DIV\_CHOP is replaced with INT\_DIV\_FLOOR so that the sign of the (implied) remainder agrees with the denominator rather than the numerator.

The unsigned quotient is computed similarly:

```
<unsigned divide>≡
(define (uns_reciprocal r y temp1 temp2)
 (let ((fy temp2)
       (e temp1)
       )
      (inst 0 (FLOAT_UNS fy y))
      (inst 0 (INT_RECIP_APPROX r fy))
      (inst 0 (INT_RECIP_ERROR e fy r))
      (inst 0 (FLOAT_ITER r r e r))
      (inst 0 (INT_RECIP_ERROR e fy r))
      (inst 0 (FLOAT_ITER r r e r))
      (inst 0 (UNS_DIV e y y r)
            (INT_ADD_IMM r r 1))
      (inst 0 (INT_SUB r r e))
      )
 )
<unsigned divide>≡
(define (uns_divide q x y temp1)
 (let ((r temp1))
      (uns_reciprocal r y q)
      (inst 0 (UNS_DIV q y x r))
      )
 )
(define (uns_divide_test q x y temp1)
 (let ((r temp1))
      (uns_reciprocal r y q)
      (inst 0 (UNS_DIV_TEST q y x r))
      )
 )
```

Note that when the divisor is constant, the sequence simplifies down to one compute instruction (and one constant which must be materialized). For example, the following two instructions compute q = x/20.
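The idea behind dividing by a constant, a precomputed reciprocal followed by one multiply and shift, can be sketched in Python with plain integers. This is not the TERA encoding: the hardware holds the reciprocal as a SpecialFloat64, whereas the sketch uses an integer multiplier and shift count, and all names and the 53-bit operand bound are assumptions chosen to match the hardware range mentioned in this section.

```python
def reciprocal_constant(d, bits=53):
    """Return (m, shift) with (x * m) >> shift == x // d for 0 <= x < 2**bits."""
    shift = bits + (d - 1).bit_length()   # bits + ceil(log2(d))
    m = -(-(1 << shift) // d)             # ceiling of 2**shift / d
    return m, shift

def divide_by_constant(x, m, shift):
    """One multiply and one right shift replace the division."""
    return (x * m) >> shift

# The materialized constant for division by 20.
M20, S20 = reciprocal_constant(20)
```

Rounding the reciprocal up to its ceiling is what makes the truncated product land on the correct quotient, which mirrors the fix step in the reciprocal routines above.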

# Chapter 13: I/O Processor Introduction

The I/O processor contains four instruction streams and a control word. The control word is used to assign a segment descriptor for fetching instructions and initializing the streams. The four streams have dedicated purposes: memory load, memory store, HIPPI input and HIPPI output. For each stream, instructions are fetched and executed in a linear fashion. If an exceptional event occurs during the execution of an instruction, the stream performs a link operation with the driver to inform it of the exception. A link operation is defined as a stream status word production, followed by a program counter consumption. The driver program must use consumer/producer semantics on the link address and the new control word address, respectively. The I/O processor ignores the forward, data trap, and memory full bits when fetching instructions.

- The memory load stream loads from data or I/O memory into the outbound data buffer.
- The memory store stream stores from the inbound data buffer into data or I/O memory.
- The HIPPI output stream transfers bursts of data from the outbound data buffer out through the HIPPI interface.
- The HIPPI input stream transfers bursts of received data into the inbound data buffer.

The HIPPI interface section of the IOP is designed to conform with the physical layer specification of the ANSI X3T9.3 committee. The basic HIPPI clock rate is 25 MHz, and the timeout clock period is 1 microsecond. The longest timeout period is 16.8 seconds. The HIPPI out section of the IOP has an external fixed 50 MHz clock which is used to generate the 25 MHz timing and the IOP timeout clock. The HIPPI in section synchronizes to the source clock for its connection, yet uses the fixed timeout clock generated by the HIPPI out section.

Each half of the IOP can operate as either a 32- or 64-bit HIPPI channel. The channel width can be selected when making each connection. The load and store streams operate only on 64-bit words.

On power up, the IOP must not request or accept any connections until it has been initialized. To satisfy this requirement, some (external) power on reset circuit is needed. In addition, connections must be inhibited during scan.

## 13.1 Link Status Word

Each stream in the I/O processor produces a link status word when it links. That word generally indicates the reason for linking, and the program counter at which the link occurred. The program counter in both link words is a byte offset into the instruction segment. The lower three bits are ignored, since the IOP only performs full word, aligned, instruction fetches. The field "exception" is set whenever one of the exceptions in bits 53 to 39 is set. The field "status\_link" is set whenever the field "pc" is not a program counter, but instead is some data such as an Ifield or error offset.

| Valid | Bits | Wd | Field Name | Type | Description |
|-------|------|----|------------|------|-------------|
| **IOPStatusWord: Exceptions** | | | | | |
| LSOI | 63 | 1 | exception | Flag | Exception |
| LSI | 62 | 1 | status_link | Flag | Status result |
| | 61-56 | 6 | os_field | Uns | Reserved for O/S use; IOP writes 0 |
| | 55-54 | 2 | 0 | | reserved |
| LSOI | 53 | 1 | forced_link | Flag | Forced link Operation |
| LSOI | 52 | 1 | link_error | Flag | Link Error |
| LSOI | 51 | 1 | p_limit_error | Flag | Instruction Segment Limit Error |
| LSOI | 50 | 1 | p_unimplemented_address | Flag | Instruction Address unimplemented on resource |
| LSOI | 49 | 1 | p_uncorrectable_error | Flag | Uncorrectable Instruction Error |
| LSOI | 48 | 1 | illegal_op_code | Flag | Illegal OP code |
| S | 47 | 1 | packet_end | Flag | Packet End exception |
| LSI | 46 | 1 | limit_error | Flag | Segment Limit Error |
| LSI | 45 | 1 | unimplemented_address | Flag | Address unimplemented on resource |
| LSOI | 44 | 1 | uncorrectable_error | Flag | Uncorrectable Data Error |
| OI | 43 | 1 | connect_lost | Flag | Connect lost |
| OI | 42 | 1 | interconnect_lost | Flag | Interconnection lost |
| OI | 41 | 1 | bad_connect | Flag | Unable to request connection |
| OI | 40 | 1 | timeout | Flag | Time out |
| O | 39 | 1 | no_connection | Flag | No connect before OUT_PACKET |
| | 38-35 | 4 | 0 | | reserved |
| **IOPStatusWord: Status** | | | | | |
| LSOI | 34-33 | 2 | Stream_identity | IopStream | Stream Identity |
| LSOI | 32 | 1 | loopback | Flag | Loopback |
| **IOPStatusWord: PC** | | | | | |
| LSOI | 31-0 | 32 | pc | Uns | Program Counter or Ifield |

The field "Stream\_identity" is encoded using the following enumeration.

| Name      | Value | Meaning      |
|-----------|-------|--------------|
| **IopStream** |   |              |
| IOP_LOAD  | 0     | Load stream  |
| IOP_OUT   | 1     | Out stream   |
| IOP_STORE | 2     | Store stream |
| IOP_IN    | 3     | In stream    |
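A driver has to pick these fields apart each time a stream links. The following is a minimal sketch of such a decoder, using only the bit positions and enumeration values from the tables above. The function name and the shape of the returned dictionary are illustrative assumptions and not part of any TERA software interface.

```python
# Sketch only: decode a 64-bit IOP link status word into named fields.
EXCEPTION_BITS = {
    53: "forced_link", 52: "link_error", 51: "p_limit_error",
    50: "p_unimplemented_address", 49: "p_uncorrectable_error",
    48: "illegal_op_code", 47: "packet_end", 46: "limit_error",
    45: "unimplemented_address", 44: "uncorrectable_error",
    43: "connect_lost", 42: "interconnect_lost", 41: "bad_connect",
    40: "timeout", 39: "no_connection",
}
STREAM_NAMES = {0: "IOP_LOAD", 1: "IOP_OUT", 2: "IOP_STORE", 3: "IOP_IN"}


def decode_link_status(word: int) -> dict:
    """Split a link status word according to the IOPStatusWord table."""
    if not 0 <= word < (1 << 64):
        raise ValueError("link status word must fit in 64 bits")
    raised = [name for bit, name in sorted(EXCEPTION_BITS.items(), reverse=True)
              if (word >> bit) & 1]
    return {
        "exception": bool((word >> 63) & 1),
        "status_link": bool((word >> 62) & 1),
        "os_field": (word >> 56) & 0x3F,
        "exceptions": raised,
        "stream": STREAM_NAMES[(word >> 33) & 0x3],
        "loopback": bool((word >> 32) & 1),
        # pc is a byte offset, or an Ifield / error offset when status_link is set
        "pc": word & 0xFFFFFFFF,
    }
```

For example, a word with bits 63, 40, and 33 set and a low half of 0x18 decodes as a timeout exception on the out stream at byte offset 0x18.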

The link status words are arranged at word indices 0 through 7 in the IOP instruction segment.

Each stream has a pair, status and next pc. These are laid out in the following order: load\_status, load\_next\_pc, out\_status, out\_next\_pc, store\_status, store\_next\_pc, in\_status, in\_next\_pc.
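The ordering of the pairs matches the IopStream enumeration, so the word index of each link word can be derived from the stream identity. The sketch below illustrates that relationship; the helper names are hypothetical.

```python
# Sketch only: locate a stream's link words within the IOP instruction segment.
STREAM_IDS = {"load": 0, "out": 1, "store": 2, "in": 3}  # IopStream values
WORD_BYTES = 8  # the IOP fetches full 64-bit words


def link_word_indices(stream: str) -> tuple[int, int]:
    """Return (status_index, next_pc_index) as word indices 0 through 7."""
    stream_id = STREAM_IDS[stream]
    return 2 * stream_id, 2 * stream_id + 1


def link_byte_offsets(stream: str) -> tuple[int, int]:
    """Same pair expressed as byte offsets into the instruction segment."""
    status, next_pc = link_word_indices(stream)
    return status * WORD_BYTES, next_pc * WORD_BYTES
```

This agrees with the operation descriptions in Chapter 14, which place the load pair at word offsets 0 and 1, the out pair at 2 and 3, the store pair at 4 and 5, and the in pair at 6 and 7.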

# Chapter 14: I/O Operation Descriptions

I/O Processor programs have the same syntactic form as Lisp expressions. Whereas CPU instructions are composed of several operations, IOP instructions always contain exactly one operation. Therefore, the INST wrapper is not needed in IOP assembly programs.

(INST\_SEGMENT dist\_en m\_type unit limit base)

$${}_{64}\;00\;{}_{56}\;0\;{}_{55}\;dist\_en\;{}_{54}\;m\_type\;{}_{52}\;0\;{}_{48}\;unit\;{}_{40}\;0\;{}_{39}\;limit\;{}_{24}\;0\;{}_{19}\;base\;{}_{0} \qquad I$$

InstSegment — immediate data

The segment word holds the data address translation descriptor for IOP instructions. The format is the same as data map entries in the processor except that the load and store levels and locked bit are omitted.

The I/O Processor is reset or initialized through the segment word. The segment word is located at word offset 0 for the logical unit number assigned to the IOP. Whenever the segment word is written to, all four streams perform a link operation in the new segment. As there is only one program segment for all four streams, the IOP should be in an idle or suspect state before changing the segment descriptor. A write of the segment word also resets the inbound and outbound data buffers. Table 1 shows the bit allocation for the stream status returned by a link operation.

The p\_limit\_error, p\_unimplemented\_address, and p\_uncorrectable\_error exceptions during a link operation cause the stream to retry the link operation, possibly indefinitely.

Instructions are provided so that one stream may cause another stream to link. Generally, streams forced to link will link between instructions. However, a stream will abort an indefinite wait to link, so that deadlocked streams may be interrupted and reset.

| (LOAD_LINK) Link Load stream    | 10 000000000000000000000000000000000000 | I |
|---------------------------------|-----------------------------------------|---|
| (LOAD_LINK_OUT) Link Out stream | 14 000000000000000000000000000000000000 | I |

The LOAD\_LINK operation forces the memory load stream to perform a link operation with the device driver. The link operation pair for the memory load stream is located at word offset 0 and 1 of the program segment for IOP code. No exception is raised.

The LOAD\_LINK\_OUT operation forces the out stream to perform a link operation. The out stream will link at the next opportunity, generally after completing the current instruction, but potentially by aborting an instruction in progress. The forced link bit will be set in the out stream's status word.

**RAISES** 

(nothing)

(LOAD\_SEGMENT dist\_en m\_type unit limit base)

$${}_{64}\;11\;{}_{56}\;0\;{}_{55}\;dist\_en\;{}_{54}\;m\_type\;{}_{52}\;0\;{}_{48}\;unit\;{}_{40}\;0\;{}_{39}\;limit\;{}_{24}\;0\;{}_{19}\;base\;{}_{0} \qquad I$$

LoadSegment — immediate data

This operation loads an address translation descriptor to be used by the memory load stream for fetching data. The format is the same as data map entries in the processor except that the load and store levels and locked bit are omitted.

There are no exceptions for the LOAD\_SEGMENT opcode.

RAISES

(nothing)

#### (LOAD\_ERR\_OFFSET)

Status.pc - Error offset

This operation always links. If there are no masking exceptions, this operation stores the offset of the load request which first resulted in an uncorrectable\_error, limit\_error, or unimplemented\_address exception in the pc field of the link status word. The status\_link flag is set. If multiple errors are encountered, only the first is reported. Note that the offset returned may not be the lowest offset which resulted in an exception. If no errors were present, LOAD\_ERR\_OFFSET returns the next offset that the load stream will issue to the network. This operation is used for diagnosing the failure of a load instruction.

RAISES

status\_link

(LOAD\_FLUSH)

13 00000000000000000 I

Flush outbound data buffer

The flush operation clears all data in the outbound data buffer. This operation should be used only when the HIPPI out stream is not connected. If the flush operation is used while the HIPPI out stream is connected, the connection will be dropped.

**RAISES** 

(nothing)

#### (LOAD\_END\_PACKET)

15 00000000000000 I

Mark end of indeterminate length packet

The end\_packet operation indicates to the out stream that the contents of the buffer are the last words of an indeterminate length packet. This command should only be used with an OUT\_PACKET command of length 0. The load stream will wait until the outbound buffer is emptied or the OUT\_PACKET command otherwise terminates before continuing with the execution of the next instruction.

RAISES

(nothing)

(LOAD\_DATA start\_offset end\_offset)

(LOAD\_IMAGE start\_offset end\_offset)

Load the 64-bit data and state beginning at the start\_offset and continuing through all words up to end\_offset minus one, inclusive. The low-order three bits of the start\_offset and end\_offset are ignored. Thus, byte addresses may be used without need for shifting.

The load stream interprets the end\_offset modulo 2<sup>28</sup>. To include the last word in a segment, a 100000000<sub>16</sub> or 0 may be used as the end\_offset.
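The offset rules can be illustrated with a short calculation. The following is a sketch under one explicit assumption: the span between the two offsets is taken modulo 2^28, so an end\_offset of 0 stands for the end of a full 2^28-byte range. The function name is hypothetical, and the IOP hardware may treat corner cases (such as equal offsets) differently.

```python
# Sketch only: how many 64-bit words a start/end offset pair covers.
OFFSET_MODULUS = 1 << 28   # offsets are interpreted modulo 2**28 (from the text)
WORD_BYTES = 8


def load_word_range(start_offset: int, end_offset: int) -> tuple[int, int]:
    """Return (first_byte_offset, word_count) for a load operation."""
    # The low-order three bits are ignored, so byte addresses are accepted.
    start = (start_offset % OFFSET_MODULUS) & ~0x7
    end = (end_offset % OFFSET_MODULUS) & ~0x7
    # ASSUMPTION: the span wraps modulo 2**28, so end == 0 means "up to the top".
    span = (end - start) % OFFSET_MODULUS
    return start, span // WORD_BYTES
```

With this reading, a start\_offset of 0FFFFFF8<sub>16</sub> and an end\_offset of 0 select exactly one word, the last word of the range.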

All forwarding, data traps, and full/empty operations are disabled during the load operation. The load stream issues LOAD\_STATE requests to the memory resources, which respond with both the data and the access state stored at the given address.

If IMAGE is used, the outgoing data buffer packs the control access fields for sixteen consecutive data words into a 64-bit state word and inserts the state word into the data stream after its respective data words. When IMAGE is used, the number of words to load from memory must be a multiple of 16. In addition, the number of non-packed data words loaded from memory before the IMAGE operation occurs must be a multiple of 16. Note that these operations may be used together to build a packet with a DATA header and IMAGE payload.
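The image layout can be sketched as follows. Two things here are assumptions rather than facts from this document: that each word contributes a 4-bit access state (64 bits divided among 16 words), and that word *i* of a group occupies bits 4*i* to 4*i*+3 of the state word. The real bit ordering is not given in this section.

```python
# Sketch only: interleave one packed state word after every 16 data words.
GROUP_WORDS = 16
STATE_BITS = 4   # ASSUMPTION: 64-bit state word / 16 data words


def pack_image(words: list[int], states: list[int]) -> list[int]:
    """Build the IMAGE stream: 16 data words followed by their state word."""
    if len(words) != len(states):
        raise ValueError("need one access state per data word")
    if len(words) % GROUP_WORDS:
        raise ValueError("IMAGE transfers must be a multiple of 16 words")
    stream = []
    for base in range(0, len(words), GROUP_WORDS):
        stream.extend(words[base:base + GROUP_WORDS])
        state_word = 0
        for i, state in enumerate(states[base:base + GROUP_WORDS]):
            # ASSUMED ordering: word i of the group lands in nibble i.
            state_word |= (state & 0xF) << (STATE_BITS * i)
        stream.append(state_word)
    return stream
```

Every 16 words loaded from memory therefore occupy 17 words in the outbound buffer, which is the origin of the 17/16 factor used when sizing an OUT\_PACKET.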

This instruction need not abort due to a forced link unless there are no free buffers into which to load.

This instruction will wait indefinitely for free space in the outbound buffer. Exceptions for the LOAD\_DATA/IMAGE operations are forced\_link, uncorrectable\_error, limit\_error, or unimplemented\_address.

#### RAISES

forced\_link, uncorrectable\_error, limit\_error, unimplemented\_address

| (STORE_LINK) Link Store stream | 80 000000000000000000000000000000000000 | I |
|--------------------------------|-----------------------------------------|---|
| (STORE_LINK_IN) Link In stream | 84 000000000000000000000000000000000000 | I |

The STORE\_LINK operation forces the memory store stream to perform a link operation with the device driver. The link operation pair for the memory store stream is located at word offset 4 and 5 of the program segment for IOP code. No exception is raised.

The STORE\_LINK\_IN operation forces the in stream to perform a link operation. The current instruction of the in stream may be interrupted. The forced link bit will be set in the in stream's status word.

**RAISES** 

(nothing)

(STORE\_SEGMENT dist\_en m\_type unit limit base)

81 0 dist\_en m\_type 0 unit 0 limit 0 base I

StoreSegment — immediate data

This operation loads an address translation descriptor to be used by the memory store stream for writing data. The format is the same as data map entries in the processor except that the load and store levels and locked bit are omitted.

**RAISES** 

(nothing)

(STORE\_REPLICATE start\_offset end\_offset)

9 start\_offset end\_offset I

Store fill data

This operation is only valid while the IOP is in loopback mode. The LOAD stream should fetch the word that is to be replicated. The STORE stream duplicates the item and its access state to all the locations between start\_offset and end\_offset minus one, inclusive. The low-order three bits of the start\_offset and end\_offset are ignored. Thus, byte addresses may be used without need for shifting. If multiple items are fetched by the LOAD stream, only the first item is removed from the outbound buffer.

The store stream interprets the end offset modulo 2<sup>28</sup>. To include the last word in a segment, a 10000000<sub>16</sub> or 0 may be used as the end offset.

This instruction need not abort due to a forced link unless the inbound buffer is completely empty.

This instruction will wait indefinitely for data to appear in the inbound buffer. Exceptions for the STORE\_REPLICATE operations are uncorrectable\_error, limit\_error or unimplemented\_address.

#### RAISES

uncorrectable\_error, limit\_error, unimplemented\_address

### (STORE\_ERR\_OFFSET)

83 000000000000000000 I

Status.pc — Error offset

This operation always links. If there are no masking exceptions, this operation stores the offset of the store request which first resulted in an uncorrectable\_error, unimplemented\_address, packet\_end, or limit\_error exception in the pc field of the link status word. The status\_link flag is set. If multiple errors are encountered, only the first is reported. Note that the offset returned may not be the lowest offset which resulted in an exception. If no errors were present, STORE\_ERR\_OFFSET returns the next offset that the store stream would have issued to the network. This operation is used for diagnosing the failure of a store instruction.

**RAISES** 

status\_link

Ι

## (STORE\_FLUSH)

Flush inbound data buffer

The flush operation clears all data in the inbound data buffer. This operation should only be used when the HIPPI input stream is not receiving a packet. If the flush operation is used while the HIPPI input stream is receiving data, parts of the incoming data packet may be lost.

**RAISES** 

(nothing)

## (STORE\_END\_PACKET)

85 00000000000000 I

Handle packet end clean up

This operation flushes all data for the current packet that has not already been stored, finishing when end of packet is received. Thus, fill data at the end of a packet can be disposed of using this operation. At the extreme, a whole packet of data will be flushed if no STORE\_DATA or STORE\_IMAGE operations are placed between successive STORE\_END\_PACKET operations.

In addition, the uncorrectable\_error exception is raised if an uncorrectable\_error was detected during the reception of this packet by the in stream. Note that a STORE\_END\_PACKET must be issued for each packet received by the HIPPI IN stream. This operation always links on completion.

RAISES

uncorrectable\_error

I

## (STORE\_END\_SEGMENT)

Validate partial packet data

This operation ensures that the uncorrectable\_error exception is raised if an uncorrectable\_error was detected during the reception of any of the data for this packet which has already been stored. Thus, if the end of a burst has not yet been received, but the initial data in the burst has been stored, this operation will wait until the LLRC check at the end of the burst has been performed. This operation always links on completion.

RAISES

uncorrectable\_error

(STORE\_DATA start\_offset end\_offset)

(STORE\_IMAGE start\_offset end\_offset)

Store the 64-bit data and state beginning at the start\_offset and continuing through all words up to end\_offset minus one, inclusive. If an end of packet is signaled before all the indicated words are received and stored, the packet\_end exception will be raised. The low-order three bits of the start\_offset and end\_offset are ignored. Thus, byte addresses may be used without need for shifting.

The store stream interprets the end offset modulo 2<sup>28</sup>. To include the last word in a segment, a 100000000<sub>16</sub> or 0 may be used as the end offset.

When DATA is used, the access control states will be set to full, no forwarding, no traps. When IMAGE is used, the access control states are unpacked from the inbound buffer. When STORE\_IMAGE is used, the number of words to store to memory must be a multiple of 16. The store stream removes every 17th word and uses the data contained in it to generate the access control states for the preceding 16 words. An uncorrectable error may be caused by failure of the inbound buffer or bad incoming HIPPI data. Note that these operations may be used together to scatter store a packet with a DATA header and IMAGE payload.
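The removal of every 17th word can be sketched as below. As with any illustration of the image format, the 4-bit state width and the nibble ordering (word *i* of a group in bits 4*i* to 4*i*+3) are assumptions made for this example, and the function name is hypothetical.

```python
# Sketch only: split an inbound IMAGE stream into data words and access states.
GROUP_WORDS = 16
STATE_BITS = 4   # ASSUMPTION: 64-bit state word / 16 data words


def unpack_image(stream: list[int]) -> tuple[list[int], list[int]]:
    """Return (data_words, access_states) from groups of 16 data + 1 state word."""
    if len(stream) % (GROUP_WORDS + 1):
        raise ValueError("IMAGE stream must be a multiple of 17 words")
    words, states = [], []
    for base in range(0, len(stream), GROUP_WORDS + 1):
        group = stream[base:base + GROUP_WORDS + 1]
        state_word = group[GROUP_WORDS]          # every 17th word
        words.extend(group[:GROUP_WORDS])
        # ASSUMED ordering: nibble i holds the state for word i of the group.
        states.extend((state_word >> (STATE_BITS * i)) & 0xF
                      for i in range(GROUP_WORDS))
    return words, states
```

Only the 16 data words of each group reach memory; the state word is consumed by the store stream.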

These instructions will wait indefinitely for data to appear in the inbound buffer. Exceptions for the STORE\_DATA/IMAGE opcodes are uncorrectable\_error, unimplemented\_address, packet\_end, or limit\_error. The unimplemented\_address and limit\_error exceptions should only occur due to program errors.

#### RAISES

uncorrectable\_error, unimplemented\_address, packet\_end, limit\_error

| (OUT_LINK) Link Out stream       | 41 000000000000000000000000000000000000 | I |
|----------------------------------|-----------------------------------------|---|
| (OUT_LINK_LOAD) Link Load stream | 48 0000000000000                        | I |

The OUT\_LINK operation forces the HIPPI output stream to perform a link operation with the device driver. The link operation pair for the HIPPI output stream is located at word offset 2 and 3 of the program segment for IOP code. No exception is raised.

The OUT\_LINK\_LOAD operation forces the load stream to perform a link operation. The current instruction of the load stream may be interrupted. The forced link bit will be set in the load stream's status word.

(OUT\_RING swap width timeout Ifield)

$${}_{64}\;\ldots\;swap\;{}_{60}\;\ldots\;{}_{57}\;width\;{}_{56}\;timeout\;{}_{32}\;Ifield\;{}_{0}$$

Request a connection on the HIPPI interface

The HIPPI request signal is asserted and the 32-bit I-field is placed on the HIPPI data path bits 31 to 0. All zeros are placed on the remaining HIPPI data bits 63 to 32. The connect signal must initially be deasserted before request can be asserted. If the connect signal is already asserted when the OUT\_RING operation is executed, a bad\_connect exception is raised.

This instruction waits until the connect signal is asserted by the destination or timeout occurs. Once the connect signal is asserted, the IOP stops transmission of the I-field. The stream then tries to detect a rejected connection. If during this time a ready pulse is received, the stream assumes the connection was accepted. If the connect line is deasserted before a ready pulse is received, the connection was rejected and a connect\_lost exception is generated.
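The connection sequence just described can be summarized as a small decision procedure. The sketch below is a software model for illustration only. The event names are invented, the interconnect\_lost case is not modeled, and the assumption that the timeout applies only while waiting for connect is ours; the text does not say what happens if neither a ready pulse nor a disconnect follows.

```python
# Sketch only: classify the outcome of an OUT_RING from a sequence of events.
def ring_outcome(connect_at_start: bool, events: list[str]) -> str:
    """events are drawn from: "connect", "ready", "connect_dropped", "timeout"."""
    if connect_at_start:
        return "bad_connect"          # connect must be deasserted before request
    connected = False
    for event in events:
        if not connected:
            if event == "timeout":    # ASSUMPTION: timeout only guards this wait
                return "timeout"
            if event == "connect":
                connected = True      # IOP stops driving the I-field here
        elif event == "ready":
            return "accepted"         # ready pulse seen: connection accepted
        elif event == "connect_dropped":
            return "connect_lost"     # rejected before any ready pulse
    return "pending"
```

For instance, `ring_outcome(False, ["connect", "connect_dropped"])` models a destination that rejects the request.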

The width field selects either a 32- or 64-bit channel width. When width is asserted, the IOP operates in 64-bit HIPPI mode. When width is deasserted, the order of the 32-bit words may be reversed by setting the swap bit.

Exceptions for the OUT\_RING opcode are timeout, connect\_lost, interconnect\_lost or bad\_connect.

#### RAISES

timeout, connect\_lost, interconnect\_lost, bad\_connect

(OUT\_LOOPBACK timeout)

4A timeout 00000000 I

Request loopback connection

The HIPPI-output stream indicates to the HIPPI-input stream that it wishes to be in loopback mode. If the HIPPI-input stream acknowledges with a similar IN\_LOOPBACK command before timeout occurs, loopback mode is established. When the IOP is in loopback mode, any data and access control state present in the outbound buffer is available for transfer into the inbound buffer. Loopback mode continues until terminated by the HIPPI-input stream or via LOAD\_LINK\_OUT.

The exceptions for the OUT\_LOOPBACK opcode are timeout, connect\_lost, and forced\_link.

RAISES

timeout, connect\_lost, forced\_link

(OUT\_LOOPMODE)

4A 56 000001 32 000000000 1 0

Select local serial loopback

The HIPPI-output stream indicates to the serial link logic that it wishes to be in local serial loopback mode. This mode remains in effect until cleared by a subsequent OUT\_LOOPBACK instruction.

**RAISES** 

(nothing)

#### (OUT\_DISCONNECT timeout)

44 timeout 00000000 I

Break connection

This operation completes the HIPPI connection by deasserting the request signal and waiting for the destination to acknowledge by deasserting the connect signal.

If a disconnect is issued while the IOP is in loopback mode, both the HIPPI input and output streams will exit loopback mode.

The exception for the OUT\_DISCONNECT opcode is timeout.

RAISES

timeout

(OUT\_CANCEL timeout)

45 timeout 00000000

Wait for late connect response

Wait for the connect signal to be asserted or for timeout to occur. After an OUT\_RING fails, this instruction may be used to wait up to a round trip delay for a late connect response to the previous request assertion.

The exception for OUT\_CANCEL is timeout.

**RAISES** 

timeout

(OUT\_INTERCONNECT width timeout)

27 width timeout 00000000 I

Wait for interconnect asserted

Wait for the interconnection lines for the specified HIPPI width to settle to an active state or for timeout to occur.

The exception for the OUT\_INTERCONNECT opcode is timeout.

RAISES

timeout

(OUT\_DELAY timeout)

47 timeout 00000000

Wait

Wait for timeout to occur, then fetch the next instruction.

There are no exceptions for OUT\_DELAY.

**RAISES** 

(nothing)

(OUT\_RESET timeout)

47 timeout 00000001 I

Send RESET to serial link

Assert RESET to the serial link for the timeout period, then fetch the next instruction.

There are no exceptions for OUT\_RESET.

**RAISES** 

(nothing)

#### (OUT\_PACKET timeout size)

49 timeout size I

Send packet

This operation sends a packet containing a number of 256 word bursts, and one final short burst, if needed, out on the HIPPI interface.

The number of eight-bit bytes to send is specified by the *size* field, which must be a multiple of eight (*size* is 8 \* words). If LOAD\_IMAGE is used, the number of words to fetch from memory must be a multiple of 16, and *size* is calculated by (17 \* words)/2.
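The size rules above, together with the 256-word burst length, can be checked with a short calculation. This is a sketch with hypothetical helper names. It handles a packet that is entirely DATA or entirely IMAGE; a packet with a DATA header and IMAGE payload would sum the two contributions.

```python
# Sketch only: compute the OUT_PACKET size operand and the resulting bursts.
MAX_PACKET_BYTES = 2**32 - 8   # maximum packet length stated for OUT_PACKET
BURST_WORDS = 256


def out_packet_size(words_from_memory: int, image: bool = False) -> int:
    """Return size in bytes for the given number of words fetched from memory."""
    if image:
        if words_from_memory % 16:
            raise ValueError("LOAD_IMAGE requires a multiple of 16 words")
        size = 17 * words_from_memory // 2   # 17/16 words, 8 bytes per word
    else:
        size = 8 * words_from_memory
    if size > MAX_PACKET_BYTES:
        raise ValueError("packet exceeds the maximum length")
    return size


def burst_lengths(size_bytes: int) -> list[int]:
    """Split a packet into full 256-word bursts plus one final short burst."""
    full, rest = divmod(size_bytes // 8, BURST_WORDS)
    return [BURST_WORDS] * full + ([rest] if rest else [])
```

For example, 600 words loaded with LOAD\_DATA give a size of 4800 bytes, sent as two full bursts and one burst of 88 words.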

The maximum packet length is  $2^{32} - 8$  bytes. A zero length packet will cause the out stream to transmit an indeterminate length packet. An indeterminate length packet is completed by the LOAD stream issuing a LOAD\_END\_PACKET instruction. When the out stream receives the end packet indication from the load stream it empties the outbound buffer and then deasserts the packet signal. The instruction then completes and the next operation is fetched. Note that this handling only allows one indeterminate length packet to be present in the outbound buffer at a time.

Output flow control is handled automatically by the IOP. The memory load stream produces data into the outbound data buffer and the output stream sends the data out as allowed by the ready signals from the destination. The out stream will wait indefinitely for data to appear in the outbound buffer. Errors in the buffer data will raise the uncorrectable error exception. When such bad data is encountered, the LLRC of the burst will be corrupted to guarantee that the receiver discards the packet.

During the OUT\_PACKET operation, a LOAD\_LINK\_OUT operation will only allow any bursts already in the outbound buffer to be sent. If there are no more bursts present and the OUT\_PACKET operation has not completed, the packet will be truncated with a forced\_link exception. Note that this will cause a packet shorter than the encoded size to be transmitted.

If the connection is lost, the operation is aborted and a connect\_lost exception is raised. If the connection was not present before packet is asserted, then the no\_connection exception will be raised instead of the connect\_lost exception. The timeout field specifies the maximum time to wait for a ready pulse when a burst is ready to send, but the ready counter is zero.

Exceptions for the OUT\_PACKET opcode are timeout, uncorrectable\_error, no\_connection, or connect\_lost.

#### RAISES

timeout, uncorrectable\_error, no\_connection, connect\_lost

| (IN_LINK) Link In stream          | C1 000000000000000000000000000000000000 | I |
|-----------------------------------|-----------------------------------------|---|
| (IN_LINK_STORE) Link Store stream | CB 000000000000000000000000000000000000 | I |

The IN\_LINK operation forces the HIPPI input stream to perform a link operation with the device driver. The link operation pair for the HIPPI input stream is located at word offset 6 and 7 of the program segment for IOP code. No exception is raised.

The IN\_LINK\_STORE operation forces the store stream to perform a link operation. The current instruction of the store stream may be interrupted. If the store stream is executing a STORE\_DATA, STORE\_IMAGE, or STORE\_REPLICATE operation, it will delay the link operation until the in buffer is empty. The forced link bit will be set in the store stream's status word.

(IN\_LISTEN timeout)

C5 timeout 00000000 I

Wait for connection request

This operation waits until a connection is requested or a timeout occurs. The hardware continuously monitors the REQUEST signal, and registers a connection request when the signal transitions from deasserted to asserted. If a timeout occurs while REQUEST is asserted but has not been deasserted since the last disconnection, the bad\_connect exception is raised. If REQUEST is deasserted when the timeout occurs, the timeout exception is raised.
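The distinction between the two timeout outcomes depends on whether REQUEST has ever been seen deasserted. The following is a small software model of that rule, written for illustration; the function and its sampled-signal representation are assumptions and do not describe the hardware implementation.

```python
# Sketch only: model IN_LISTEN's outcome from sampled REQUEST levels.
def in_listen_outcome(request_samples: list[bool], initially_asserted: bool) -> str:
    """request_samples are REQUEST levels observed until the timeout expires.

    initially_asserted says whether REQUEST was still asserted when the
    previous connection ended (i.e. it has not yet been deasserted).
    """
    previous = initially_asserted
    for level in request_samples:
        if level and not previous:
            return "connection_request"   # deasserted-to-asserted transition
        previous = level
    # Timeout reached without a rising edge. If REQUEST is still asserted it
    # has been asserted continuously since the last disconnection.
    return "bad_connect" if previous else "timeout"
```

A source that never releases REQUEST after a disconnect therefore yields bad\_connect rather than a new connection.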

If no exception occurs, the current value on the HIPPI input data pins is stored in the pc field of the link status word and the status\_link flag is set. The IN stream then waits for a new pc, completing a link operation.

While an IN\_LISTEN is waiting for a connection request, it will immediately abort with a forced\_link exception when the STORE stream executes STORE\_LINK\_IN.

The exceptions for the IN\_LISTEN opcode are timeout, bad\_connect, or interconnect\_lost.

#### RAISES

timeout, bad\_connect, interconnect\_lost, status\_link

(IN\_REJECT)

I

Reject connection request

This instruction asserts the connect signal, then deasserts it after eight HIPPI clock cycles. If the request signal is already deasserted, connect is not asserted and the connect\_lost exception is raised. The exception for the IN\_REJECT opcode is connect\_lost.

RAISES

connect\_lost

(IN\_ACCEPT swap width timeout)

61 swap 60 4 width timeout 300000000

I

Accept connection request

This instruction asserts the connect signal. If the request signal is deasserted, connect is not asserted and the connect\_lost exception is raised.

The width field indicates the channel width of the HIPPI in stream. When width is asserted, a 64-bit HIPPI channel is used. If no bursts are received within a timeout period, the timeout exception is raised. When width is deasserted, the order of the 32-bit words may be reversed by setting the swap bit.

Input flow control is handled automatically by the IOP. A ready indication will only be signaled to the source when the IOP can guarantee buffer space (in the IOP or in memory) to hold the enabled burst. The in stream will wait indefinitely for free space in the inbound buffer.

If a parity error or LLRC error is detected within a burst of this packet, then an error flag is associated with this burst, the connection is dropped (ending the packet), and the store stream will take an uncorrectable\_error exception after storing the data.

If a burst is received with more than the maximum 256 words allowed, then an error flag is associated with this burst, the connection is dropped (ending the packet), and the store stream will take an uncorrectable\_error exception after storing the data.

If a zero length packet is received, the connection is dropped and the connect\_lost exception is raised. This instruction is only terminated via exceptions: connect\_lost, uncorrectable\_error, timeout, or forced\_link. In all cases, the connection is dropped when this instruction terminates. If a partial packet has been placed in the inbound buffer, a packet end is marked on an exception. [[ There is some race condition here that escapes me. ]]

The exceptions for the IN\_ACCEPT opcode are connect\_lost, uncorrectable\_error, and timeout.

RAISES

connect\_lost, uncorrectable\_error, timeout

#### (IN\_LOOPBACK timeout)

C3 timeout 00000000 I

Enter loopback mode

The HIPPI-input stream indicates to the HIPPI-output stream that it wishes to be in loopback mode. If the HIPPI-output stream acknowledges with a similar OUT\_LOOPBACK command before timeout occurs, loopback mode is established. When the IOP is in loopback mode, any data and access control state present in the outbound buffer is available for transfer into the inbound buffer.

This operation is only terminated via exceptions. The connect\_lost exception is raised when the connection is terminated by the out stream executing OUT\_DISCONNECT. In addition, IN\_LOOPBACK may be aborted with a STORE\_LINK\_IN operation in the store stream.

The exceptions for the IN\_LOOPBACK opcode are timeout and connect\_lost.

RAISES

timeout, connect\_lost

(IN\_INTERCONNECT width timeout)

67 width timeout 00000000 I

Wait for interconnect asserted

Wait for the interconnection lines for the specified HIPPI width to settle to an active state or for timeout to occur.

The exception for the IN\_INTERCONNECT opcode is timeout.

**RAISES** 

timeout

(IN\_DELAY timeout)

Wait

Wait for timeout to occur, then fetch the next instruction.

There are no exceptions for the IN\_DELAY opcode.

RAISES

(nothing)

# Chapter 15: I/O Processor Examples

## 15.1 Loading Memory

The following code fragment may be used as a template for loading a segment of memory into the outbound data buffer. If an error occurs during the loading of data from the network, or the HIPPI out stream encounters an exception during the output command for this load, the driver will be notified through the link address. The notification will indicate that the pc-offset for the load stream is at offset 1 relative to the start of the code fragment, and the exception bit in the SSW will be set, along with the error bit that caused the exception.

```
<Load_Segment>≡
(LOAD_SEGMENT dist_en mem_t unit limit base)
(LOAD_DATA start_off end_off)
(LOAD_LINK)
```

The following code fragment loads an upper layer protocol (ULP) header in unpacked format into the outbound buffer, and then loads the actual data in image format into the outbound buffer.

```
<Load_Image>≡
;Segment for ULP data
(LOAD_SEGMENT ULPdist_en ULPmem_t ULPunit ULPlimit ULPbase)
(LOAD_DATA ULPstart ULPend)
;Segment for Dataset
(LOAD_SEGMENT dist_en mem_t unit limit base)
(LOAD_IMAGE start_off end_off)
(LOAD_LINK)
```

## 15.2 Sending Data

The following code fragment may be used as a template for making a 64-bit connection to the external HIPPI domain. When the link operation occurs, the driver must examine the exception bit in the SSW to determine whether the operation succeeded or no external device responded to the connection request.

```
<Make_Ring>≡
(OUT_RING 1 1000 Ifield)
(OUT_LINK)
```

If the Make\_Ring code fragment fails on the OUT\_RING instruction, the IOP must allow for a spurious connect signal. The following code fragment waits for an entire connect pulse to occur or for two milliseconds, whichever occurs first.

```
<Break_Ring>≡
(OUT_CANCEL 2000)
(OUT_DISCONNECT 100)
(OUT_LINK)
```

The following code fragment may be used to transfer a packet on the HIPPI channel. The number of words loaded by the load stream is length. If the load stream used LOAD\_IMAGE, the length must be multiplied by 17/16 to compensate for the extra state words packed into the outbound buffer.

```
<Output_Packet>≡
(OUT_PACKET 100 length)
(OUT_LINK)
```
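The 17/16 adjustment can be sketched in Python. This is only an illustration of the arithmetic, not part of any IOP tool: the helper name `out_packet_length` is hypothetical, and rounding up when the word count is not a multiple of 16 is an assumption, since the text states only the 17/16 factor.

```python
def out_packet_length(words_loaded: int, used_load_image: bool) -> int:
    """Return a length operand for OUT_PACKET (hypothetical helper).

    words_loaded    -- number of data words loaded by the load stream
    used_load_image -- True if the load stream used LOAD_IMAGE
    """
    if words_loaded < 0:
        raise ValueError("words_loaded must be non-negative")
    if not used_load_image:
        # LOAD_DATA: no extra state words, the length is used unchanged.
        return words_loaded
    # LOAD_IMAGE: scale by 17/16 to account for the extra state words.
    # Ceiling division is an assumption for counts not divisible by 16.
    return (words_loaded * 17 + 15) // 16
```

Under these assumptions, 1024 words loaded with LOAD\_IMAGE give a length of 1088, while the same words loaded with LOAD\_DATA give 1024.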

## 15.3 Receiving Data

The following code fragment returns an I-field if an external device tries to connect with the IOP before a timeout occurs. The I-field is stored in the pc field of the status link word.

```
<Listen_For_Request>≡
(IN_INTERCONNECT 10)
(IN_LISTEN 10000)
```

The following code fragment receives a sequence of packets from the HIPPI port. The connection is normally terminated by the sender, but can be broken by a STORE\_LINK\_IN operation.

```
<Receive_Packets>≡
(IN_ACCEPT 1 100)
```

## 15.4 Storing Memory

The following code fragment stores the upper layer protocol header in one area and the data information in an I/O buffer.

```
<Receive_Image>≡
;ULP data area
(STORE_SEGMENT ULPdist_en ULPmem_t ULPunit ULPlimit ULPbase)
(STORE_DATA ULPstart ULPend)
;File system IO area
(STORE_SEGMENT dist_en mem_t unit limit base)
(STORE_IMAGE start_off end_off)
(STORE_LINK)
```

```
<iopexamples.asm>≡
<Load_Segment>
<Load_Image>
<Make_Ring>
<Break_Ring>
<Output_Packet>
<Listen_For_Request>
<Receive_Packets>
<Receive_Image>
```

## Appendix A: Operation Encoding Summary

This appendix shows the encoding for every operation. The first column contains 64 subcolumns, one for every bit in an instruction word, numbered from right to left. The second column is the assembly language prototype. Within the first column, the symbol "-" indicates that the particular bit is not used. The symbol "0" or "1" indicates a literal value encoding the operation. An alphabetic symbol shows where bits encoding an operand occur. If the operand name is "xyz", then the first letter "x" is repeated to fill the object code field. A "\*" indicates an operand which is not used by the particular operation (don't care).
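The notation can be read mechanically. The following Python sketch splits a pattern into bit fields and assembles a word from operand values. The function names `parse_pattern` and `encode` are hypothetical, filling unused and don't-care bits with zero is an arbitrary choice, and placing the most significant bit of each operand at the left of its field is an assumption. The example pattern is the operand and literal portion of the BIT\_AND entry, with the unused-bit dashes omitted.

```python
from itertools import groupby


def parse_pattern(pattern: str):
    """Split an encoding pattern into (symbol, high_bit, low_bit) runs.

    Bits are numbered from right to left, so the last character is bit 0.
    """
    fields = []
    high = len(pattern) - 1
    for symbol, run in groupby(pattern):
        width = len(list(run))
        fields.append((symbol, high, high - width + 1))
        high -= width
    return fields


def encode(pattern: str, operands: dict) -> int:
    """Assemble a word from a pattern and operand values keyed by letter."""
    word = 0
    for symbol, high, low in parse_pattern(pattern):
        width = high - low + 1
        if symbol in "01":
            # Literal bits that encode the operation.
            value = int(symbol * width, 2)
        elif symbol in "-*":
            # Unused or don't-care bits; zero is an arbitrary choice here.
            value = 0
        else:
            # Operand field named by its repeated first letter.
            value = operands[symbol]
            if not 0 <= value < (1 << width):
                raise ValueError(f"operand {symbol!r} does not fit in {width} bits")
        word |= value << low
    return word
```

For instance, `encode("tttttuuuuuvvvvv00001001100", {"t": 1, "u": 2, "v": 3})` places 1, 2, and 3 in the three 5-bit operand fields above the 11 literal bits.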

#### A.1 M OPs

| rrrrrsssss0000 | (LOAD_FE r s)   |
|----------------|-----------------|
| rrrrrsssss0001 | (INT_LOADH r s) |
| rrrrrsssss0010 | (INT_LOADQ r s) |
| rrrrrsssss0011 | (INT_LOADB r s) |
|                | (LOAD r s)      |
|                | (UNS_LOADH r s) |
|                | (UNS_LOADQ r s) |
| 0000000000110  | (NOP)           |
|                | (UNS_LOADB r s) |
|                | (STORE r s)     |
|                | (STOREH r s)    |
| rrrrrsssss1010 | (STOREQ r s)    |
| rrrrrsssss1011 | (STOREB r s)    |

#### A.2 MC OPs

```
-----ddddddddddddddddd10 (INT_LOADB_DISP r s disp)
---rrrrrsssss1100-----aaaaaddddddddddddd11 (INT_LOADB_AC_DISP r s ac disp)
---rrrrrsssss1100------ddddddddddddddddddd100 (INT_LOADQ_DISP r s disp)
aaaaadddddddddddd101 (INT_LOADQ_AC_DISP r s ac disp)
---rrrrrsssss1100----
-----ddddddddddddddd1000 (INT_LOADH_DISP r s disp)
aaaaaddddddddddddd1001 (INT_LOADH_AC_DISP r s ac disp)
---rrrrrsssss1100---
dddddddddddddddddd10000 (INT_FETCH_ADD_DISP r s disp)
---rrrrrsssss1100---
ddddddddddddddddd0000 (STATE_LOCK_DISP r s disp)
-----aaaaaddddddddddddd00001 (STATE_LOCK_AC_DISP r s ac disp)
-----ddddddddddddddddd10 (UNS_LOADB_DISP r s disp)
-rrrrrsssss1101----
-----aaaaadddddddddddd11 (UNS_LOADB_AC_DISP r s ac disp)
-rrrrrsssss1101--
disp)
aaaaadddddddddddd101 (UNS_LOADQ_AC_DISP r s ac disp)
----rrrrrsssss1101----
-rrrrrsssss1101-----dddddddddddddddddd000 (UNS_LOADH_DISP r s disp)
-----dddddddddddddddd10000 (LOAD_DISP r s disp)
-----aaaaadddddddddddd10001 (LOAD_AC_DISP r s ac disp)
----dddddddddddddddddd00000 (REG_LOAD_DISP r s disp)
aaaaaddddddddddddd00001 (REG_LOAD_AC_DISP r s ac disp)
--ddddddddddddddddd10 (STOREB_DISP r s disp)
```

```
---rrrrrsssss1110------ddddddddddddddddd100 (STOREQ_DISP r s disp)
---rrrrrsssss1110-----aaaaadddddddddddd101 (STOREQ_AC_DISP r s ac disp)
---rrrrrsssss1110-----dddddddddddddddd1000 (STOREH_DISP r s disp)
----rrrrrsssss1110-----aaaaaddddddddddddd10001 (STORE_AC_DISP r s ac disp)
---rrrrrsssss1110------dddddddddddddddddddddddd00000 (STATE_STORE_DISP r s disp)
-rrrrrsssss1110------00000ddddddddddd00001 (STATE_STORE_ERROR_DISP r s disp)
lla**yyyy00000111001 (PROBE_INDEX r s lev access y)
----rrrrrsssss1111------
aaaaayyyyy00001111001 (REG_STORE_AC_INDEX r s ac y)
-rrrrrsssss1111----aaaaayyyyy01010111001 (UNS_LOADH_AC_INDEX r s ac y)
aaaaayyyyy01100111001 (UNS_LOADQ_AC_INDEX r s ac y)
aaaaayyyyy01110111001 (UNS_LOADB_AC_INDEX r s ac y)
---aaaaayyyyy10000111001 (INT_FETCH_ADD_AC_INDEX r s ac y)
aaaaayyyyy10010111001 (INT_LOADH_AC_INDEX r s ac y)
----aaaaayyyyy10110111001 (INT_LOADB_AC_INDEX r s ac y)
---aaaaayyyyy11000111001 (STORE_AC_INDEX r s ac y)
---rrrrrsssss1111-----
-aaaaayyyyy110011111001 (STATE_STORE_AC_INDEX r s ac y)
```

| rrrrrsssss1111 | -aaaaayyyyy11100111001 | (STOREQ_AC_INDEX r s ac y)       |
|----------------|------------------------|----------------------------------|
| rrrrrsssss1111 | -aaaaayyyyy11110111001 | (STOREB_AC_INDEX r s ac y)       |
| rrrrrsssss1111 | -dddddddddddd01000     | (STATE_SCRUB_DISP r s disp)      |
| rrrrrsssss1111 | -dddddddddddddd10000   | (STATE_LOAD_DISP r s disp)       |
| rrrrrsssss1111 | -lla**ddddddddddd10001 | (PROBE_DISP r s lev access disp) |
| rrrrrsssss1111 | -dddddddddddddddd00000 | (REG_STORE_DISP r s disp)        |
| rrrrrsssss1111 | -aaaaaddddddddddd00001 | (REG_STORE_AC_DISP r s ac disp)  |

#### A.3 A OPs

```
----- (BREAK)
----- (STREAM_QUIT)
----- (STREAM_QUIT_PRESERVE)
----- (STREAM_CATCH r t x delay str)
----- (DOMAIN_IDENTIFIER_SAVE t)
------ttttt01000*****00001000000------ (STREAM_IDENTIFIER_SAVE t)
----- (STREAM_CUR_SAVE t)
----- (STREAM_RES_SAVE t)
------ (REG_MOVE t v)
-----ttttt01010mmmmmmmcc0000000----- (LOGICAL_ONE t mask cn)
----- (LOGICAL_ALLONE t mask cn)
----- (BIT_MASK t top bot)
------ (FLOAT_SCALE t v m)
------tttt11010vvvvvvvvvvv000000------ (INT_SHIFT_RIGHT t v v)
----- (INT_RECIP_ERROR t v v)
------ (FLOAT_RECIP_ERROR t v v)
------ (LOGICAL_ONE_TEST t mask cn)
------- (LOGICAL_ALLONE_TEST t mask cn)
----- (INT_SHIFT_RIGHT_TEST t v v)
------ (INT_IMM t value)
----- (NOP)
-----tttttuuuuuvvvviiicc000100---------- (SELECT_INT t u v intselect cn)
------ (SELECT_INT_TEST t u v intselect cn)
------ (SELECT_FLOAT t u v floatselect cn)
------ (SELECT_FLOAT_TEST t u v floatselect cn)
------ (STREAM_RESERVE t u st)
-----tttttuuuuu01ssssssss01000------ (STREAM_RESERVE_UPTO t u st)
-----tttttuuuuu1000ssssss011000------- (SHIFT_LEFT_IMM t u sh)
------ttttuuuuu110000000001000---------- (FLOAT_NEAR t u)
----- (FLOAT_CHOP t u)
-----ttttuuuuu110000010001000------- (FLOAT_FLOOR t u)
------ (FLOAT_CEIL t u)
------ (INT_NEAR t u)
------ (INT_CHOP t u)
------tttttuuuuu1100000110001000------ (INT_FLOOR t u)
-----ttttuuuuu1100000111001000------- (INT_CEIL t u)
------ (UNS_NEAR t u)
-----tttttuuuuu1100001001001000------ (UNS_CHOP t u)
------tttttuuuuu1100001010001000------ (UNS_FLOOR t u)
-----tttttuuuuu1100001011001000------ (UNS_CEIL t u)
-----tttttuuuuu1100001100001000------ (FLOAT_ROUND t u)
----tttttuuuuu1100001101001000----- (INT_ROUND t u)
-----tttttuuuuu1100001110001000------ (UNS_ROUND t u)
```

```
------ (FLOAT_REAL t u)
-----tttttuuuuu1100011110001000------ (FLOAT_INT t u)
-----tttttuuuuu11000111111001000----- (FLOAT_UNS t u)
-----tttttuuuuu11010aaaaa001000----- (PTR_SET_AC t u ac)
------tttttuuuuu1101100000001000--------- (BIT_MAT_TRANSPOSE t u)
------ttttt*****1101110000001000------ (COUNT_ISSUES t)
------ (COUNT_MEMREFS t)
-----ttttt*****1101110010001000------ (COUNT_STREAMS t)
------ttttt*****11011101ee001000---------- (COUNT_EVENTS t ec)
-----ttttt******11011111000001000------ (COUNT_PHANTOMS t)
-----ttttt******1101111001001000------- (COUNT_READY t)
---ttttt-----11011111100001000------- (COUNT_SELECT_SAVE t)
------ (STREAM_RESERVE_TEST t u st)
-----tttttuuuuu01ssssssss001001------ (STREAM_RESERVE_UPTO_TEST t u st)
-----tttttuuuuu1000ssssss001001----- (SHIFT_LEFT_IMM_TEST t u sh)
-----ttttuuuuu110000010001001---------- (INT_NEAR_TEST t u)
-------tttttuuuuu1100000101001001------ (INT_CHOP_TEST t u)
------tttttuuuuu1100000110001001----- (INT_FLOOR_TEST t u)
-----tttttuuuuu1100000111001001------ (INT_CEIL_TEST t u)
------tttttuuuuu110000100001001------ (UNS_NEAR_TEST t u)
----tttttuuuuu1100001001001001----- (UNS_CHOP_TEST t u)
------tttttuuuuu1100001010001001------ (UNS_FLOOR_TEST t u)
-----tttttuuuuu1100001011001001-------- (UNS_CEIL_TEST t u)
-----tttttuuuuu1100001101001001------ (INT_ROUND_TEST t u)
------tttttuuuuu1100001110001001------ (UNS_ROUND_TEST t u)
------ (PROGRAM_STATE_RESTORE u)
------ (PROGRAM_MAP_FLUSH_ANY u)
------ (PROGRAM_CACHE_FLUSH u)
------ (PROGRAM_CACHE_FLUSH_ANY u)
------ (EXCEPTION_RESTORE u)
----- (SSW_RESTORE u)
------ (DOMAIN_LEAVE u)
----- (DOMAIN_ENTER)
----- (COUNT_SELECT_RESTORE u)
-----ttt01uuuuu*******001010------ (TARGET_RESTORE tn u)
-----ttt10oooooooooooooo001010----- (TARGET_DISP tn offset)
-----ttt11uuuuu0000000000001010------ (TARGET_INDEX tn u)
----tttttuuuuuooooooooooooo001011----- (STREAM_CREATE_IMM r t u x y offset)
----tttttuuuuuvvvv00000001100----- (BIT_NIMP t u v)
----tttttuuuuuvvvvv00001001100----- (BIT_AND t u v)
-----tttttuuuuuvvvvv00010001100------ (BIT_XOR t u v)
----tttttuuuuuvvvv00011001100----- (BIT_OR t u v)
-----tttttuuuuuvvvvv00101001100------ (BIT_XNOR t u v)
-----tttttuuuuuvvvv00110001100----- (BIT_NAND t u v)
-----tttttuuuuuvvvvv00111001100----- (BIT_IMP t u v)
-----tttttuuuuuvvvv01000001100----- (BIT_ODD_NIMP t u v)
------tttttuuuuuvvvvv01010001100------ (BIT_ODD_XOR t u v)
------ (BIT_ODD_OR t u v)
------tttttuuuuu0000001111001100------- (BIT_TALLY t u)
------tttttuuuuuvvvvv10000001100------ (BIT_MAT_OR t u v)
------tttttuuuuuvvvvv10001001100------ (BIT_MAT_XOR t u v)
------tttttuuuuuvvvvv1010101100------------------ (BIT_UNPACK_1 t u v)
```

| | |
|---|---|
| tttttuuuuuvvvvv10110001100 | (BIT_UNPACK_2 t u v) |
| tttttuuuuuvvvvv10111001100 | (BIT_UNPACK_3 t u v) |
| ttttuuuuuvvvvv00000001101 | |
| ttttuuuuvvvvv00001001101 | |
| ttttuuuuuvvvvv00010001101 | |
| ttttuuuuuvvvvv00011001101 | |
| ttttuuuuuvvvv00100001101 | |
| tttttuuuuuvvvv00101001101 | |
| tttttuuuuuvvvvv00110001101 | |
| tttttuuuuuvvvvv00111001101 | |
| tttttuuuuuvvvv01000001101 | |
| tttttuuuuuvvvv01001001101 | |
| ttttuuuuuvvvv01010001101 | |
| tttttuuuuvvvvv01011001101 | (BIT_ODD_OR_TEST t u v) |
| ttttuuuuu0000001111001101 | |
| | |
| tttttuuuuvvvvv10000001110 | |
| tttttuuuuvvvvv10001001110 | |
| tttttuuuuvvvvv10010001110 | (FLOAT_MIN t u v) |
| tttttuuuuvvvvv10011001110 | |
| tttttuuuuuvvvvv10100001110 | |
| ttttuuuuuvvvvv10101001110 | |
| ttttt*****vvvv11001001110 | |
| tttttuuuuuvvvv11100001110 | |
| tttttuuuuuvvvv11101001110 | (INT_SUB t u v) |
| tttttuuuuuvvvvv11110001110 | (INT_MIN t u v) |
| tttttuuuuuvvvvv11111001110 | (INT_MAX t u v) |
| tttttuuuuvvvv10001001111 | (FLOAT_CMP_TEST t u v) |
| tttttuuuuuvvvvv10010001111 | (FLOAT_MIN_TEST t u v) |
| tttttuuuuuvvvvv10011001111 | (FLOAT_MAX_TEST t u v) |
| tttttuuuuuvvvvv10100001111 | (FLOAT_MMIN_TEST t u v) |
| tttttuuuuuvvvvv10101001111 | (FLOAT_MMAX_TEST t u v) |
| tttttuuuuuvvvvv11100001111 | (INT_ADD_TEST t u v) |
| tttttuuuuuvvvvv11101001111 | (INT_SUB_TEST t u v) |
| tttttuuuuuvvvvv11110001111 | |
| tttttuuuuuvvvvv11111001111 | |
| ttttuuuuuvvvvvvvvvv010100 | |
| ttttuuuuuvvvvvvvvv010101 | |
| tttttuuuuuavavvaaaa010110 | |
| ttttuuuuuvvvvvaaaa010111 | |
| tttttuuuuuvvvvvvvvv011000 | |
| tttttuuuuvvvvvssss011001 | (INT_DIV_CHOP_TEST t u v w) |
| ttttuuuuuvvvvvvvvvvv011010 | (INT_DIV_FLOOR t u v w) |
| tttttuuuuuvvvvvvavav011011 | (INT_DIV_FLOOR_TEST t u v w) |
| tttttuuuunvvvvvvvvv011100 | (UNS_DIV t u v w) |
| ttttuuuuuvvvvvvvvvoonaa011101 | (UNS_DIV_TEST t u v w) |
| tttttuuuuvvvvvvvvv011110 | (FLOAT_DIV t u v w) |
| tttttuuuuuvvvvvvvvv011111 | (FIGHT CORT + |
| ttttuuuuu0vvvvvvv100000 | (FLOAT_SQRT t u v w) |
| ttttuuuuu1vvvvvvv100000 | (INT_ADD_IMM t u value) |
| ttttuuuu0vvvvvv100001 | (INT_SUB_IMM t u value) |
| | (INT_ADD_IMM_TEST t u value) |
| tttttuuuui1vvvvvvv100001 | (INT_SUB_IMM_TEST t u value) |
| tttttuuuuvvvvvvvvvvv0100110 | (BIT_MERGE t u v w) |
| tttttuuuuvvvvvvvvvv 100111 | (BIT_MERGE_TEST t u v w) |
| tttttuuuuuvvvvvaasaa101000 | (INT_ADD_MUL t u v w) |
| tttttuuuuuvvvvvvvvavata101001 | (INT_ADD_MUL_TEST t u v w) |
| tttttuuuuuvvvvvvvvv 101010 | (INT_SUB_MUL t u v w) |
| ttttuuuuuvvvvvvvvvvu 101011 | (INT_SUB_MUL_TEST t u v w) |
| ttttuuuuuvvvvvapaaa101100 | (UNS_ADD_MUL_UPPER t u v w) |
| ttttuuuuuvvvvvvvvv00000101101 | (UNS_ADD_MUL_UPPER_TEST t u v w) |

| tttttuuuuvvvvvaaaaa101110   | (INT_SUB_MUL_REV t u v w)        |
|-----------------------------|----------------------------------|
|                             | (INITODIUNTER E A A)             |
| tttttuuuuvvvvvaaaaa101111   | (INT_SUB_MUL_REV_TEST t u v w)   |
| ttttuuuuuvvvvvaassa 110000  | (FLOAT_ADD_MUL t u v w)          |
| ttttuuuuuvavvvaassa110010   | (FLOAT_SUB_MUL t u v w)          |
| ttttuuuuvvvvvaaaaa110100    | (FLOAT_MUL_LOWER t u v w)        |
| tttttuuuuvvvvvvvvvvvvvvv    | (FLOAT_SUB_MUL_REV t u v w)      |
| tttttuuuuvvvvvvvavava111000 | (FLOAT_ITER t u v w)             |
| -tttttuuuuvvvvaaaaa111010   | (FLOAT_DIV_APPROX t u v w)       |
| tttttuuuuvvvvvaaaa111011    | (FLOAT_SQRT_APPROX_TEST t u v w) |
|                             | (FLOAT_DIV_ERROR t u v w)        |
| tttttuuuuuvvvvvavaan 111101 | (FLOAT_SQRT_ERROR_TEST t u v w)  |
| tttttuuuuvvvvvvvvvv111111   | (FLOAT_RSQRT_ERROR_TEST t u v w) |

#### A.4 C OPs

```
yyyyy00001000000 (FLOAT_APPROX_RESTORE y)
-----xxxxxyyyyy00010000000 (REG_MOVE x y)
-----xxxxxtttttt00011000000 (TRAP_SAVE x tr)
-----xxxxxyyyyy00100000000 (BIT_LEFT_ONES x y)
xxxxyyyyy01000000000 (INT_RECIP_APPROX x y)
-----xxxxxyyyyy01100000000 (FLOAT_RECIP_APPROX x y)
xxxxxyyyyy01111000000 (UNS_RECIP_SHIFT x y)
xxxxx*****1100000000 (STREAM_COUNT_INST x)
xxxxx******11100000000 (EXCEPTION_SAVE x)
xxxxx*****11101000000 (RESULTCODE_SAVE x)
xxxxx*****11111000000 (STREAM_LOOKAHEAD_SAVE x)
xxxxxyyyyy00100000001 (BIT_LEFT_ONES_TEST x y)
xxxxxyyyyy00101000001 (BIT_LEFT_ZEROS_TEST x y)
xxxxxyyyyy00111000001 (BIT_RIGHT_ZEROS_TEST x y)
-----xxxxyyyyy01011000001 (INT_LOGB_TEST x y)
-----xxxxxyyyyy01100000001 (FLOAT_RECIP_APPROX_TEST x y)
xxxxxyyyyy01101000001 (FLOAT_RSQRT_APPROX_TEST x y)
-----xxxxxyyyyy01110000001 (INT_RECIP_SHIFT_TEST x y)
------xxxxxyyyyyzzzzzz000011 (ROTATE_RIGHT_TEST x y z)
xxxxxyyyyyvvvvv000101 (INT_ADD_IMM_TEST x y value)
------xxxxxyyyyyvvvvv000110 (INT_SUB_IMM x y value)
-----xxxxxyyyyyvvvvv000111 (INT_SUB_IMM_TEST x y value)
----xxxxxyyyyyzzzzzz001100 (FLOAT_ADD x y z)
--xxxxxyyyyyzzzzz001110 (FLOAT_SUB x y z)
xxxxxyyyyyzzzzz010000 (BIT_NIMP x y z)
```

```
------xxxxxyyyyyzzzzz010010 (BIT_AND x y z)
----xxxxxyyyyyzzzzz010011 (BIT_AND_TEST x y z)
-----xxxxxyyyyzzzzz010100 (BIT_XOR x y z)
-----xxxxyyyyyzzzzz010101 (BIT_XOR_TEST x y z)
-----xxxxxyyyyyzzzzz010110 (BIT_OR x y z)
-----xxxxxyyyyzzzzz010111 (BIT_OR_TEST x y z)
-----xxxxxyyyyyzzzzz011001 (SHIFT_LEFT_TEST x y z)
-----xxxxxyyyyyzzzzz011010 (ROTATE_LEFT x y z)
------xxxxxyyyyyzzzzz011011 (ROTATE_LEFT_TEST x y z)
------xxxxxyyyyyzzzzz011100 (UNS_SHIFT_RIGHT x y z)
-----xxxxxyyyyyzzzzz011101 (UNS_SHIFT_RIGHT_TEST x y z)
-----xxxxxyyyyyzzzzz011110 (INT_SHIFT_RIGHT x y z)
------ (INT_SHIFT_RIGHT_TEST x y z)
-----xxxxxyyyyyaaaaa100000 (PTR_SET_AC x y ac)
----- (UNS_ADD_CARRY_TEST x y z)
----- (TRAP_RESTORE tr y)
-----xxxxxyyyyyzzzzz100011 (UNS_SUB_CARRY_TEST x y z)
-----xxxxxyyyyyzzzzzz100101 (INT_ADD_TEST x y z)
-----xxxxxyyyyyzzzzz100110 (INT_SUB x y z)
-----xxxxyyyyyyzzzzz100111 (INT_SUB_TEST x y z)
-----xxxx000ppooo0110ooo (SSW_DISP x offset)
------BENERRENDE COOOO1110000 (SKIP mask cn offset)
-----xxxx00000011110110ttt (TARGET_SAVE x tn)
--mmmmmmmmcc111111110ttt (JUMP mask cn tn)
----- (LEVEL_ENTER lev)
-----000000011111110110ttt (LEVEL_RTN lev tn)
-----Bank cn offset)
-----mmmmmmmmmcc11111111ttt (JUMP_SELDOM mask cn tn)
```

#### A.5 MAC OPs

#### A.6 I OPs

# Appendix B: Processor State

The following table describes all of the state information maintained by a processor. The rows describe what state information is maintained and whether user, supervisor, and IPL privilege can directly read (abbreviated "r") or write (abbreviated "w") that state. An asterisk ("\*") indicates that the state cannot be written, but can be indirectly modified. Kernel level has the same capabilities as supervisor level. An unfilled entry is the same as the one above it.

| per    | LEV_USER | LEV_SUPER | LEV_IPL | number | bits | what               | reference |
|--------|----------|-----------|---------|--------|------|--------------------|-----------|
| stream | rw       | rw        | rw      | 1      | 64   | stream status word | §2.1      |
|        | ŀ        |             |          | 1      | 64   | exception register                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                             | §9.1           |
|        |          |             |          | 1      | 64   | result code register                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                             | §9.1           |
|        |          |             |          | 31     | 64   | general purpose registers                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                             | §1.3           |
|        |          |             |          | 8      | 32   | target registers                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
                                                                                                             | §2.2           |
|        |          |             |          | 1      | 16   | instruction count register                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
                                                                                                             | §10            |
|        | -        | rw          | IM       | 1      | 4    | protection domain          | §8.2  |
|        | •        | -           |          | 1      | 2    | stream level               | §8.1  |
| pd     | r*       | r*          | r*       | 1      | 56   | in at mark in the state of the | 0.0            |
|        |          |             | -        | 1      | 56   | instruction issue counter  | §10   |
|        |          |             |          | 1      | 56   | memory reference counter   | §10   |
|        |          |             |          | 1      | 56   | stream counter             | §10   |
|        |          |             |          | 4      | 64   | concurrency counter        | §10   |
|        | •        | rw          | rw       | 1      | 64   | selectable event counters  | §10.2 |
|        |          |             |          | 1      | 64   | data state descriptor      | §6.2  |
|        |          |             |          | 16,384 |      | program state descriptor   | §7.1  |
|        |          |             |          | 8,192  | 64   | data address map entries   | §6.2  |
|        | r*       | r*          | r*       | •      | 64   | program address map entries                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                             | §7             |
| 1      |          | •           | •        | 1      | 7    | stream reserved, SRES $_D$                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
                                                                                                             | §2             |
|        |          | <del></del> |          | 1      | 7    | stream current, SCURD                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                             | §2             |
| proc   | rw       | rw          | rw       | 384    | 64   | trap registers                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
                                                                                                             | §9.2           |
| 1      |          |             |          | 512    | 64   | data control registers                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
                                                                                                             | §6.3           |
|        |          |             |          | 512    | 64   | data value registers                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                             | §6.3           |
|        | r        | r*          | r*       | 132    | 32   | program address TLB entries                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                             | §0.3<br>§7     |
|        |          |             |          | 1024   | 64   | data address TLB entries                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
                                                                                                             | §6.2           |
|        | r        | r           | rw       | 256    | 32   | reciprocal table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
                                                                                                             | 30.2           |
|        |          |             |          | 256    | 32   | reciprocal square root table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
                                                                                                             |                |
| •      | r        | r           | r        | 1      | 56   | phantom counter                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                             | §10.2          |
|        |          |             |          | 1      | 56   | ready counter                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                             | §10.2<br>§10.2 |
|        |          |             |          | 1      | 64   | clock                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                             | §10.2<br>§10   |
|        |          |             |          | _      |      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                             | 210            |

# Appendix C: GF(2) Addressing Matrices

## C.1 Scrambling Matrices

This is the GF(2) matrix used for address scrambling:

```
1011000011000110101
101001000111101010101
0101000110001010001
0100010010010001110
1000110100000101101
0110110111110111001
0011011000011101110
000010.00000111101
```

This is the GF(2) inverse matrix:

```
1000111101010101011
0011100010110011101
0001110011100001000
00000110010000000011
0000000001101101011
```
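The source lists the matrices but not how they are applied. As a minimal sketch, assuming scrambling is an ordinary matrix-vector product over GF(2) (each output bit is the parity of the input bits selected by one row), the following shows the forward and inverse transforms undoing each other. The 4×4 matrices `TOY_SCRAMBLE` and `TOY_INVERSE`, the function names, and the convention that row 0 produces the most significant output bit are all hypothetical and chosen for illustration; they are not the MTA matrices or its bit ordering.

```python
def parse_rows(text):
    """Turn whitespace-separated strings of '0'/'1' into integer row masks."""
    return [int(line, 2) for line in text.split()]


def gf2_matvec(rows, x):
    """Multiply a GF(2) matrix (list of row masks) by the bit vector x.

    Output bit i is the parity (XOR) of the bits of x selected by row i.
    Assumed convention: row 0 yields the most significant output bit.
    """
    y = 0
    for row in rows:
        parity = bin(row & x).count("1") & 1
        y = (y << 1) | parity
    return y


# Hypothetical 4x4 example matrix (upper triangular, hence invertible).
TOY_SCRAMBLE = parse_rows("""
1100
0110
0011
0001
""")

# Its inverse over GF(2), worked out by hand for this toy example.
TOY_INVERSE = parse_rows("""
1111
0111
0011
0001
""")

scrambled = gf2_matvec(TOY_SCRAMBLE, 0b1011)
restored = gf2_matvec(TOY_INVERSE, scrambled)
print(f"{scrambled:04b} {restored:04b}")
```

Because the matrix is invertible, the mapping is a permutation of the input values: distinct inputs always scramble to distinct outputs, and applying the inverse matrix recovers the original.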

This is the data single-bit error syndrome table. If a syndrome is found in this table, then bit 4 × row + col is in error. If the syndrome is zero, there is no error. Otherwise, there is an uncorrectable error. Bits 71-64 are the ECC bits.

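|      |      |      |      |
|------|------|------|------|
| 0x80 | 0x40 | 0x20 | 0x10 |
| 0x08 | 0x04 | 0x02 | 0x01 |
| 0xc6 | 0xe1 | 0xe2 | 0xd1 |
| 0xc9 | 0xd2 | 0xe4 | 0xe8 |
| 0xd4 | 0xca | 0xc5 | 0xc3 |
| 0x86 | 0xa1 | 0xa2 | 0x91 |
| 0x89 | 0x92 | 0xa4 | 0xa8 |
| 0x94 | 0x8a | 0x85 | 0x83 |
| 0x8c | 0xb0 | 0xa0 | 0x90 |
| 0x88 | 0x84 | 0x82 | 0x81 |
| 0x46 | 0x61 | 0x62 | 0x51 |
| 0x49 | 0x52 | 0x64 | 0x68 |
| 0x54 | 0x4a | 0x45 | 0x43 |
| 0x4c | 0x70 | 0x60 | 0x50 |
| 0x48 | 0x44 | 0x42 | 0x41 |
| 0x06 | 0x21 | 0x22 | 0x11 |
| 0x09 | 0x12 | 0x24 | 0x28 |
| 0x14 | 0x0a | 0x05 | 0x03 |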
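The lookup rule above can be sketched in a few lines. This is an illustration under stated assumptions, not the hardware's implementation: it assumes the table is listed in row-major order starting at row 0 and column 0 (so an entry's position in the listing equals 4 × row + col), and that bit 0 is the least significant bit of the 72-bit word. If the listing runs in the opposite direction, the index would need to be mirrored. The names `DATA_SYNDROMES`, `classify_data_syndrome`, and `correct_data_word` are hypothetical.

```python
# Data syndrome table, copied in listing order, four entries per row.
# ASSUMPTION: the first listed row is row 0 and the first entry is col 0.
DATA_SYNDROMES = [
    0x80, 0x40, 0x20, 0x10,
    0x08, 0x04, 0x02, 0x01,
    0xC6, 0xE1, 0xE2, 0xD1,
    0xC9, 0xD2, 0xE4, 0xE8,
    0xD4, 0xCA, 0xC5, 0xC3,
    0x86, 0xA1, 0xA2, 0x91,
    0x89, 0x92, 0xA4, 0xA8,
    0x94, 0x8A, 0x85, 0x83,
    0x8C, 0xB0, 0xA0, 0x90,
    0x88, 0x84, 0x82, 0x81,
    0x46, 0x61, 0x62, 0x51,
    0x49, 0x52, 0x64, 0x68,
    0x54, 0x4A, 0x45, 0x43,
    0x4C, 0x70, 0x60, 0x50,
    0x48, 0x44, 0x42, 0x41,
    0x06, 0x21, 0x22, 0x11,
    0x09, 0x12, 0x24, 0x28,
    0x14, 0x0A, 0x05, 0x03,
]


def classify_data_syndrome(syndrome):
    """Return ('ok', None), ('single', bit) or ('uncorrectable', None)."""
    if syndrome == 0:
        return ("ok", None)
    if syndrome in DATA_SYNDROMES:
        row, col = divmod(DATA_SYNDROMES.index(syndrome), 4)
        return ("single", 4 * row + col)
    return ("uncorrectable", None)


def correct_data_word(word72, syndrome):
    """Flip the failing bit of a 72-bit word when the error is correctable.

    ASSUMPTION: bit 0 is the least significant bit of word72.
    """
    status, bit = classify_data_syndrome(syndrome)
    if status == "single":
        word72 ^= 1 << bit
    return status, word72


print(classify_data_syndrome(0x00))
print(classify_data_syndrome(0xC6))
print(classify_data_syndrome(0xFF))
```

The table holds 72 distinct nonzero values, one per bit of the word including the ECC bits, so of the 255 nonzero 8-bit syndromes the remaining 183 are reported as uncorrectable.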

This is the access state single-bit error syndrome table. If a syndrome is found in this table, then the bit numbered by its row is in error. If the syndrome is zero, there is no error. Otherwise, there is an uncorrectable error. Bits 7-4 are the ECC bits. Bits 3-0 are the access state.

0x8

0x4

0x2

0x1

0x7

0xb

0xd

0xe
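Since the access state syndrome is only four bits wide, the whole syndrome space can be enumerated. The sketch below sorts all 16 values into the three cases described above, and also checks a property of the listed values: the XOR of any two distinct entries is nonzero and is not itself an entry. The names are hypothetical, and the row numbering assumes the first listed entry is row 0. The source does not say how syndromes of multiple errors combine; the double-error remark in the comments holds only if they combine by XOR, as in a linear code.

```python
from itertools import combinations

# Access state syndrome table in listing order.
# ASSUMPTION: the first listed entry is row 0.
ACCESS_SYNDROMES = [0x8, 0x4, 0x2, 0x1, 0x7, 0xB, 0xD, 0xE]


def classify_access_syndrome(syndrome):
    """Return ('ok', None), ('single', row) or ('uncorrectable', None)."""
    if syndrome == 0:
        return ("ok", None)
    if syndrome in ACCESS_SYNDROMES:
        return ("single", ACCESS_SYNDROMES.index(syndrome))
    return ("uncorrectable", None)


def summarize_syndrome_space():
    """Count how many of the 16 possible syndromes fall into each case."""
    counts = {"ok": 0, "single": 0, "uncorrectable": 0}
    for syndrome in range(16):
        counts[classify_access_syndrome(syndrome)[0]] += 1
    return counts


def pairs_fall_outside_table():
    """True if every XOR of two distinct entries is nonzero and not an entry.

    IF syndromes of multiple errors combine by XOR (an assumption, not
    stated in the source), this means a double-bit error is never mistaken
    for a correctable single-bit error.
    """
    return all(
        (a ^ b) != 0 and (a ^ b) not in ACCESS_SYNDROMES
        for a, b in combinations(ACCESS_SYNDROMES, 2)
    )


print(summarize_syndrome_space())
print(pairs_fall_outside_table())
```

Every entry in the table has an odd number of set bits (one or three), which is why the XOR of any two of them, having an even number of set bits, can never coincide with an entry.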

# Index

|                                                  |                                                      |
|--------------------------------------------------|------------------------------------------------------|
| **A**                                            | BIT_NAND (operation) 64                              |
|                                                  | BIT_NIMP_TEST (operation) 65                         |
| A-operation 15, 17, 52, 147                      | BIT_NIMP (operation) 65                              |
| A-unit 15, 46, 50, 137                           | BIT_NOR_TEST (operation) 66                          |
| AccessState (struct) 23                          | BIT_NOR (operation) 66                               |
| access control 10, 23, 24, 29, 32, 55, 114, 121, | BIT_ODD_AND_TEST (operation) 67                      |
| 134, 138, 142, 154, 155, 157, 159, 160,          | BIT_ODD_AND (operation) 67                           |
| 161, 162, 212, 227                               | BIT_ODD_NIMP_TEST (operation) 67                     |
| access state 10, 23, 29, 33, 55, 144, 146, 154,  | BIT_ODD_NIMP (operation) 67                          |
| 155, 156, 157, 158, 204, 207, 243                | BIT_ODD_OR_TEST (operation) 67                       |
| active 12, 219, 228                              | BIT_ODD_OR (operation) 67                            |
| address 10, 12, 14, 23, 24, 25, 26, 28, 29, 30,  | BIT_ODD_XOR_TEST (operation) 67                      |
| 31, 33, 35, 36, 37, 38, 39, 40, 41, 44,          | BIT_ODD_XOR (operation) 67                           |
| 45, 48, 55, 56, 80, 81, 83, 138, 139,            | BIT_OR_TEST (operation) 68                           |
| 140, 153, 194, 204, 207, 212, 230, 242           | BIT_OR (operation) 68                                |
| address translation 23, 24, 26, 33, 200, 206     | BIT_PACK (operation) 69                              |
| align 26, 154, 155, 194                          | BIT_RIGHT_ONES_TEST (operation) 70                   |
| allsig (field in ProgramStateDescriptor) 41      | BIT_RIGHT_ONES (operation) 70                        |
| Aop (field in Operation) 15                      | BIT_RIGHT_ZEROS_TEST (operation) 70                  |
| A_float_result_code (field in ResultCode) 46     | BIT_RIGHT_ZEROS (operation) 70                       |
| A_float_result_reg (field in ResultCode) 46      | BIT_TALLY_TEST (operation) 71                        |
|                                                  | BIT_TALLY (operation) 71                             |
| B                                                | BIT_UNPACK_1 (operation) 72                          |
|                                                  | BIT_UNPACK_2 (operation) 72                          |
| bad_connect (field in IOPStatusWord) 194, 214,   | BIT_UNPACK_3 (operation) 72                          |
| 224                                              | BIT_XNOR_TEST (operation) 73                         |
| base (field in DataStateDescriptor) 9, 10, 28,   | BIT_XNOR (operation) 73                              |
| 30, 39, 41, 53, 89, 119, 125, 178, 198,          | BIT_XOR_TEST (operation) 74                          |
| 200, 206                                         | BIT_XOR (operation) 74                               |
| BIT_AND_TEST (operation) 58                      | branch 11, 13, 15, 16, 17, 48, 131, 152, 153,        |
| BIT_AND (operation) 58                           | 169                                                  |
| BIT_IMP_TEST (operation) 59                      | BREAK (operation) 75                                 |
| BIT_IMP (operation) 59                           | byte_offset (field in DataAddress) 26                |
| BIT_LEFT_ONES_TEST (operation) 60                |                                                      |
| BIT_LEFT_ONES (operation) 60                     | byte 9, 23, 24, 25, 26, 30, 31, 33, 62, 116, 138,    |
| BIT_LEFT_ZEROS_TEST (operation) 60               | 154, 155, 159, 175, 194, 204, 207, 212,              |
| BIT_LEFT_ZEROS (operation) 60                    | 222                                                  |
| BIT_MASK (operation) 61                          | C                                                    |
| BIT_MAT_OR (operation) 62                        |                                                      |
| BIT_MAT_TRANSPOSE (operation) 62                 | C-operation 15, 17, 52                               |
| BIT_MAT_XOR (operation) 62                       | C-unit 15, 46, 50, 52, 137                           |
| BIT_MERGE_TEST (operation) 63                    | carry bit 17, 20                                     |
| BIT_MERGE (operation) 63                         | cc_0 (field in StreamStatusWord) 12                  |
| BIT_NAND_TEST (operation) 64                     | cc_1 (field in StreamStatusWord) 12                  |

| cc_2 (field in StreamStatusWord) 12 | condition code 11, 12, 17, 19, 20, 55, 90, 92, 93, 96, 97, 98, 99, 103, 110, 111, 128, 129, 131, 135, 136, 152 |
|---|---|
| cc_3 (field in StreamStatusWord) 12 | condition mask 17, 131, 135, 136, 152 |
| chop 86, 109, 171, 190 | CondMask (enum) 17, 18, 19, 131, 135, 136, 152 |
| CLOCK (operation) 51, 76, 137 | COND_NEG_C (enum in CondCode) 17 |
| CNT_A_NOP (enum in CountSource) 50, 137 | COND_NEG_NC (enum in CondCode) 17 |
| CNT_CONCURRENCY (enum in CountSource) | COND_OVFNAN_C (enum in CondCode) 17 |
| CNT_CREATE (enum in CountSource) 51, 166 | COND_OVFNAN_NC (enum in CondCode) 17 |
| CNT_C_NOP (enum in CountSource) 50, 137 | COND_POS_C (enum in CondCode) 17 |
| CNT_FLOAT_ADD (enum in CountSource) 50, 87, 88, 90, 100, 106, 107 | COND_POS_NC (enum in CondCode) 17 |
| CNT_FLOAT_DIV (enum in CountSource) 50, | COND_ZERO_C (enum in CondCode) 17 |
| CNT_FLOAT_MUL (enum in CountSource) 50, 88, 100, 107 | COND_ZERO_NC (enum in CondCode) 17 |
| CNT_FLOAT_SQRT (enum in CountSource) 50, 105 | connect_lost (field in IOPStatusWord) 194, 214, 215, 222, 225, 226, 227 |
| CNT_FLOAT_TOTAL (enum in CountSource) 50, 87, 88, 90, 91, 92, 93, 96, 97, 98, 99, 100, 103, 104, 105, 106, 107 | Cop (field in Operation) 15 |
| CNT_INT_FETCH_ADD (enum in CountSource) 50, 114 | counter 10, 38, 39, 40, 49, 50, 51, 77, 163, 165 |
| CNT_ISSUES (enum in CountSource) 51 | CountSource (enum) 49, 50, 51, 77 |
| CNT_JUMP_EXPECTED (enum in CountSource) 50, 131, 152 | COUNT_CONCURRENCY (operation) 49, 77 |
| CNT_JUMP_UNEXPECTED (enum in CountSource) | count_disable (field in StreamStatusWord) 13, |
| CNT_LEVEL (enum in CountSource) 50, 132 | COUNT_EVENTS (operation) 50, 77 |
| CNT_LOAD (enum in CountSource) 50, 116, 117, 118, 133, 134, 144, 154, 175, 176, 177 | COUNT_ISSUES (operation) 49, 77 |
| CNT_MEMREFS (enum in CountSource) 51 | COUNT_MEMREFS (operation) 49, 77 |
| CNT_MEM_RETRY (enum in CountSource) 50 | COUNT_READY (operation) 79 |
| CNT_M_NOP (enum in CountSource) 50, 137 | COUNT_PHANTOMS (operation) 79 |
| CNT_QUIT (enum in CountSource) 51, 167 | COUNT_SELECT_RESTORE (operation) 49, 77, 78 |
| CNT_STORE (enum in CountSource) 50, 146, 155, 157, 158, 159, 160, 161, 162 | COUNT_SELECT_SAVE (operation) 49, 78 |
| CNT_STREAMS (enum in CountSource) 51 | COUNT_STREAMS (operation) 49, 77 |
| CNT_TARGET (enum in CountSource) 50, 169 | create (field in ExceptionRegister) 44, 45, 83, 166 |
| CNT_TRANSFER_TOTAL (enum in CountSource) 50, 131, 152 | create exception 44, 83, 166 |
| CNT_TRAP (enum in CountSource) 48, 51 | cv (condition code) 12, 17, 131, 135, 136, 149, 152, 153, 172 |
| code generator 15 | C_float_result_code (field in ResultCode) 46 |
| complex 56 | C_float_result_reg (field in ResultCode) 46 |
| concurrency counter 49, 77, 241 | D |
| CondCode (enum) 12, 17 | DataAddress (struct) 10, 24, 25, 26 |
| | DataAddrUns (a base type) 10, 24, 31 |
| | DataControlDescriptor (struct) 31 |
| | DataFrame (a base type) 10, 26, 39 |
| | DataMapEntry (struct) 26 |
| | DataResultCode (enum) 46, 47 |
| | DataSegment (a base type) 10, 26, 39 |

| DataStateDescriptor (struct) 39 | data trap 23, 24, 31, 43, 45, 47, 157, 194, 204 |
|---|---|
| data_alignment (field in ExceptionRegister) 13, 44, 45, 81, 114, 116, 117, 118, 121, 133, 134, 144, 146, 156, 157, 158, 159, 160, 161, 162, 175, 176, 177 | data value register 31, 32, 241 |
| data_blocked (field in ExceptionRegister) 44, 45, 81, 114, 116, 117, 118, 121, 133, 134, 144, 146, 156, 157, 158, 159, 160, 161, 162, 175, 176, 177 | debugger 49, 75 |
| data_frame_offset (field in DataAddress) 26 | denormalize 21, 46, 86, 102, 119, 126, 188 |
| data_hw_error (field in ExceptionRegister) 44, 45, 114, 116, 117, 118, 121, 133, 134, 144, 146, 154, 155, 156, 157, 158, 159, 160, 161, 162, 175, 176, 177 | depend 15, 16 |
| DATA_MAP_FLUSH_ANY (operation) 28, 32, 80 | dest_reg (field in DataControlDescriptor) 31, 32, 81 |
| DATA_MAP_FLUSH (operation) 28, 32, 80 | distribution_enable (field in DataMapEntry) 26, 29, 36 |
| DATA_OPA_SAVE (operation) 31, 32, 47, 81 | distributor 10, 26, 29, 36 |
| DATA_OPD_SAVE (operation) 31, 32, 47, 81 | DomainDataAddress (struct) 28 |
| DATA_OP_REDO (operation) 31, 47, 81 | DomainDataTLBAddress (struct) 28 |
| data_prot (field in ExceptionRegister) 39, 44, 45, 80, 81, 114, 116, 117, 118, 121, 133, 134, 138, 144, 146, 154, 155, 156, 157, 158, 159, 160, 161, 162, 175, 176, 177 | DomainProgramAddress (struct) 35 |
| data_segment_number (field in DataAddress) | DOMAIN_ENTER (operation) 83 |
| data_segment_offset (field in DataAddress) 26 | DOMAIN_IDENTIFIER_SAVE (operation) 38, |
| DATA_STATE_RESTORE (operation) 32, 39, 80, 82 | DOMAIN_LEAVE (operation) 38, 83 |
| data address unimplemented 30, 47 | domain_signal_trap_disable (field in StreamStatusWord) 12 |
| data address 26, 28, 29, 35, 80, 198, 241 | domain_signal (field in ExceptionRegister) 44, |
| data alignment 25, 26, 32, 47 | domain 25, 26, 28, 29, 35, 38, 39, 40, 41, 49, 80, 82, 83, 138, 140, 141, 164, 230 |
| data control register 31, 32, 241 | domain signal 12, 38, 41, 43, 45 |
| data map limit 25, 26, 32, 39, 47, 80 | doubled precision 21, 88, 100, 185, 186 |
| data map 10, 25, 26, 28, 29, 30, 36, 38, 39, 55, 80, 198, 200, 206 | dr0 (field in ResultCode) 46, 147 |
| data memory retry exception 32, 39, 43, 45, | dr7 (field in ResultCode) 46 |
| data memory 10, 23, 24, 25, 26, 28, 29, 30, 32, 33, 36, 39, 41, 43, 45, 47, 50, 80, 81 | DR_DATA_ALIGNMENT (enum in DataResultCode) 47 |
| data protection 25, 29, 30, 32, 43, 45, 47, 138 | DR_DATA_TRAP01 (enum in DataResultCode) |
| data result 44, 46, 47, 81, 163 | DR_DATA_TRAP0 (enum in DataResultCode) |
| data segment number 10, 26, 39 | DR_DATA_TRAP1 (enum in DataResultCode) |
| data segment 10, 25, 26, 29, 32, 47 | DR_LATENCY_LIMIT (enum in DataResultCode) 47 |
| data state descriptor 25, 26, 38, 39, 82, 241 | DR_MAP_LIMIT (enum in DataResultCode) |
| data trap bit 22, 25, 154, 155 | DR_NONE (enum in DataResultCode) 47 |
| data trap disable 25, 157 | DR_PROTECTION_LEVEL (enum in DataResultCode) 47 |
| | DR_RETRY_LIMIT (enum in DataResultCode) |
| | DR_SEGMENT_LIMIT (enum in DataResultCode) 47 |
| | DR_UNCORRECTABLE_ERROR (enum in DataResultCode) 47 |

| DR_UNIMPLEMENTED_ADDRESS (enum in DataResultCode) 47 | Ex_Poison (enum in Exception) 43 |
|---|---|
| DR_UNIMPLEMENTED_OP (enum in DataResultCode) 39, 47, 81 | Ex_Privileged (enum in Exception) 43 |
| d (protection domain) 38, 77, 78, 83, 84, 164, 166 | Ex_Prog_HW_Error (enum in Exception) 43 |
| E | Ex_Prog_Prot (enum in Exception) 43 |
| effective address 24, 25, 26 | F |
| endian 23 | fe_control (field in DataControlDescriptor) 23, 24, 31, 116, 117, 118, 133, 134, 154, 157, 175, 176, 177 |
| enumeration member 9 | FE_FUTURE (enum in FullEmptyControl) 24, 25, 154, 157 |
| enumeration name 9 | FE_NORMAL (enum in FullEmptyControl) 24, 25, 32, 116, 117, 118, 133, 157, 175, 176, 177 |
| EventSelect (struct) 49 | FE_SYNC (enum in FullEmptyControl) 24, 25, 154, 157 |
| event counter 49, 50, 53, 77, 78, 241 | field name 9 |
| ExceptionRegister (struct) 45, 46 | Float32 (struct) 20 |
| EXCEPTION_RESTORE (operation) 43, 85 | Float64 (struct) 20, 92, 103, 124 |
| EXCEPTION_SAVE (operation) 43, 85 | FloatResultCode (enum) 46 |
| exception 10, 11, 12, 13, 14, 15, 20, 21, 23, 25, 26, 29, 30, 31, 32, 36, 38, 41, 43, 44, 45, 46, 47, 48, 49, 51, 53, 55, 75, 81, 85, 86, 91, 92, 100, 102, 105, 109, 112, 113, 123, 130, 132, 138, 140, 145, 146, 149, 157, 165, 170, 171, 173, 174, 194, 198, 199, 200, 201, 205, 208, 210, 211, 212, 213, 214, 215, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230 | FloatSelect (enum) 19 |
| Exception (enum) 43, 44, 45, 84, 194 | FLOAT_ADD_MUL (operation) 50, 88 |
| exception (exception status register) 85, 164, 166 | FLOAT_ADD (operation) 50, 52, 87 |
| exception register 11, 32, 43, 44, 47, 85, 166, 241 | FLOAT_APPROX_RESTORE (operation) 89 |
| exec_level (field in ProgramMapEntry) 33, 36 | FLOAT_CEIL (operation) 21, 86 |
| exponent 20, 92, 93, 102, 103, 113, 119, 174 | FLOAT_CHOP (operation) 86 |
| Ex_Create (enum in Exception) 43 | FLOAT_CMP_TEST (operation) 20, 90 |
| Ex_Data_Alignment (enum in Exception) 43 | FLOAT_DIV_APPROX (operation) 92 |
| Ex_Data_Blocked (enum in Exception) 43 | FLOAT_DIV_ERROR (operation) 93 |
| Ex_Data_HW_Error (enum in Exception) 43 | FLOAT_DIV (operation) 50, 91 |
| Ex_Data_Prot (enum in Exception) 43 | float_extension (field in ExceptionRegister) 21, 45, 46, 102, 112, 113, 123, 130, 173, 174 |
| Ex_Domain_Signal (enum in Exception) 43 | FLOAT_FLOOR (operation) 86 |
| Ex_Float_Extension (enum in Exception) 43 | float_inexact_trap_disable (field in StreamStatusWord) 12 |
| Ex_Float_Inexact (enum in Exception) 44 | float_inexact (field in ExceptionRegister) 45, 46, 86, 87, 88, 91, 92, 94, 100, 104, 105, 106, 107, 108, 109, 143, 171 |
| Ex_Float_Invalid (enum in Exception) 44 | FLOAT_INT (operation) 94 |
| Ex_Float_Overflow (enum in Exception) 44 | float_invalid_trap_disable (field in StreamStatusWord) 12 |
| Ex_Float_Underflow (enum in Exception) 44 | float_invalid (field in ExceptionRegister) 20, 21, 45, 46, 87, 88, 90, 92, 106, 107, 109, 171 |
| Ex_Float_Zero_Divide (enum in Exception) 44 | FLOAT_ITER (operation) 91, 95, 188 |
| Ex_Instruction_Count (enum in Exception) 43 | FLOAT_MAX_TEST (operation) 96 |

| FLOAT_MAX (operation) 20, 96 | float point 9, 21, 87, 88, 90, 91, 92, 93, 95, 96, 97, 98, 99, 100, 102, 103, 104, 105, 106, 107, 123, 124, 126, 143 |
|---|---|
| FLOAT_MIN_TEST (operation) 19, 20, 97 | float underflow 12, 44, 45 |
| FLOAT_MIN (operation) 20, 97 | float zero divide 12, 44, 45 |
| FLOAT_MMAX_TEST (operation) 98 | flop 180, 182 |
| FLOAT_MMAX (operation) 98 | forced_link (field in IOPStatusWord) 194, 204 |
| FLOAT_MMIN_TEST (operation) 99 | format 9, 20, 52, 101, 102, 198, 200, 206, 230 |
| FLOAT_MMIN (operation) 99 | forward_enable (field in AccessState) 23, 154, 157 |
| FLOAT_MUL_LOWER (operation) 100 | forward 13, 23, 24, 25, 31, 49, 50, 116, 117, 118, 133, 138, 144, 146, 152, 154, 155, 156, 157, 158, 175, 176, 177, 194, 204, 212 |
| FLOAT_NEAR (operation) 86 | forward bit 25 |
| float_overflow_trap_disable (field in StreamStatusWord) 12 | fraction (field in Float32) 20, 21, 22 |
| float_overflow (field in ExceptionRegister) 45, 46, 87, 88, 91, 92, 100, 104, 106, 107, 143 | frame_offset (field in ProgL2Address) 35 |
| FLOAT_REAL (operation) 101, 143 | frame number 10, 33, 36 |
| FLOAT_RECIP_APPROX_TEST (operation) | FR_DQ (enum in FloatResultCode) 46 |
| FLOAT_RECIP_APPROX (operation) 46, 102, 188 | FR_DR (enum in FloatResultCode) 46 |
| FLOAT_RECIP_ERROR (operation) 103, 188 | FR_DZ (enum in FloatResultCode) 46 |
| FLOAT_ROUND (operation) 86 | FR_FG (enum in FloatResultCode) 46 |
| FLOAT_RSQRT_APPROX_TEST (operation) 102 | FR_FX (enum in FloatResultCode) 46 |
| FLOAT_RSQRT_APPROX (operation) 46, 102 | FR_IM (enum in FloatResultCode) 46 |
| FLOAT_RSQRT_ERROR_TEST (operation) 103 | full/empty 23, 24, 25, 31, 154, 157, 204 |
| FLOAT_SCALB (operation) 104 | FullEmptyControl (enum) 23, 24, 31 |
| FLOAT_SQRT_APPROX_TEST (operation) 92, 188 | full (field in AccessState) 23, 24, 25, 138, 144, 146, 154, 155, 156, 157, 158, 194, 212 |
| FLOAT_SQRT_ERROR_TEST (operation) 93 | fwd_disable (field in DataControlDescriptor) 23, 24, 31, 154, 157 |
| FLOAT_SQRT (operation) 50, 105 | G |
| FLOAT_SUB_MUL_REV (operation) 100, 107 | general purpose register 55, 166, 241 |
| FLOAT_SUB_MUL (operation) 107 | global memory 30 |
| FLOAT_SUB (operation) 50, 106 | H |
| float_underflow_trap_disable (field in StreamStatusWord) 12 | halfword 9, 10, 23, 24, 117, 160, 176 |
| float_underflow (field in ExceptionRegister) 45, 46, 88, 91, 100, 104, 107, 143 | hardware_trap_disable (field in StreamStatusWord) 12 |
| FLOAT_UNS (operation) 108 | hardware error 30, 43, 45 |
| float_zero_divide (field in ExceptionRegister) 21, 45, 46, 92 | I |
| float_zero_div_trap_disable (field in StreamStatusWord) 12 | identifier 53, 84, 163, 166 |
| float 18, 19, 43, 45, 46, 50, 86, 94, 101, 108, 109, 119, 143, 171, 188 | idle 12, 198 |
| float inexact 12, 44, 45 | IEEE754 floating point 9, 17, 20, 21, 86, 109, 171 |
| float invalid operation 44, 45 | |
| float invalid 12, 45 | |
| float overflow 12, 44, 45 | |

| IF_0 (enum in CondMask) 18                      | INST_SEGMENT (operation) 198                    |
|-------------------------------------------------|-------------------------------------------------|
| IF_1 (enum in CondMask) 18                      | integer 9, 17, 18, 19, 46, 55, 86, 94, 95, 104, |
| IF_2 (enum in CondMask) 18                      | 109, 110, 111, 112, 113, 119, 120, 122,         |
| IF_3 (enum in CondMask) 18                      | 123, 124, 125, 128, 129, 130, 172, 173,         |
| IF_4 (enum in CondMask) 18                      | 174, 178, 190, 192                              |
| IF_5 (enum in CondMask) 18                      | interconnect_lost (field in IOPStatusWord) 194, |
| IF_6 (enum in CondMask) 18                      | 214, 224                                        |
| IF_7 (enum in CondMask) 18                      | IntSelect (enum) 19                             |
| IF_ALWAYS (enum in CondMask) 17                 | INT_ADD_IMM_TEST (operation) 111                |
| IF_CY (enum in CondMask) 18                     | INT_ADD_IMM (operation) 55, 60, 111             |
| IF_EQ (enum in CondMask) 17, 18                 | INT_ADD_MUL_TEST (operation) 112                |
| IF_FGE (enum in CondMask) 18                    | INT_ADD_MUL (operation) 112, 173                |
| IF_FGT (enum in CondMask) 18                    |                                                 |
| IF_FLE (enum in CondMask) 18                    | INT_ADD_TEST (operation) 110                    |
| IF_FLT (enum in CondMask) 18                    | INT_ADD (operation) 60, 110                     |
| IF_FUN (enum in CondMask) 18                    | INT_CEIL_TEST (operation) 109                   |
| IF_F (enum in CondMask) 18                      | INT_CEIL (operation) 109                        |
| IF_IGE (enum in CondMask) 18                    | INT_CHOP_TEST (operation) 109                   |
| IF_IGT (enum in CondMask) 18                    | INT_CHOP (operation) 21, 109                    |
| IF_ILE (enum in CondMask) 18                    | INT_DIV_CHOP_TEST (operation) 113               |
| IF_ILT (enum in CondMask) 18                    | INT_DIV_CHOP (operation) 113, 190               |
| IF_IMI (enum in CondMask) 18                    | INT_DIV_FLOOR_TEST (operation) 113              |
| IF_IMZ (enum in CondMask) 18                    | INT_DIV_FLOOR (operation) 113, 190              |
| IF_IOV (enum in CondMask) 18                    | INT_FETCH_ADD_AC_DISP (operation) 114           |
| IF_IPL (enum in CondMask) 18                    | INT_FETCH_ADD_AC_INDEX (operation) 114          |
| IF_IPZ (enum in CondMask) 18                    | INT_FETCH_ADD_DISP (operation) 114              |
| IF_NC (enum in CondMask) 18                     | INT_FETCH_ADD_INDEX (operation) 114             |
| IF_NEVER (enum in CondMask) 17                  | INT_FLOOR_TEST (operation) 109                  |
| IF_NE (enum in CondMask) 18                     | INT_FLOOR (operation) 109                       |
| IF_NZ (enum in CondMask) 18                     | INT_IMM (operation) 115, 137                    |
| IF_T (enum in CondMask) 18                      | INT_LOADB_AC_DISP (operation) 116               |
| IF_UGE (enum in CondMask) 18                    | INT_LOADB_AC_INDEX (operation) 116              |
| IF_UGT (enum in CondMask) 18                    | INT_LOADB_DISP (operation) 116                  |
|                                                 | INT_LOADB_INDEX (operation) 116                 |
| IF_ULE (enum in CondMask) 18                    | INT_LOADB (operation) 31, 116                   |
| IF_ULT (enum in CondMask) 18                    | INT_LOADH_AC_DISP (operation) 117               |
| IF_ZE (enum in CondMask) 18                     | INT_LOADH_AC_INDEX (operation) 117              |
| illegal_op_code (field in IOPStatusWord) 194    | INT_LOADH_DISP (operation) 117                  |
| instruction_count (field in ExceptionRegister)  | INT_LOADH_INDEX (operation) 117                 |
| 44, 45                                          | INT_LOADH (operation) 31, 117                   |
| instruction 10, 12, 13, 15, 16, 17, 24, 33, 35, | INT_LOADQ_AC_DISP (operation) 118               |
| 36, 40, 43, 44, 45, 47, 49, 51, 52, 53,         | INT_LOADQ_AC_INDEX (operation) 118              |
| 55, 85, 139, 147, 153, 164, 165, 166,           | INT_LOADQ_DISP (operation) 118                  |
| 169, 180, 182, 190, 192, 194, 195, 199,         | INT_LOADQ_INDEX (operation) 118                 |
| 201, 203, 204, 205, 207, 208, 213, 214,         | INT_LOADQ (operation) 31, 118                   |
| 216, 218, 220, 221, 222, 223, 225, 226,         | INT_LOGB_TEST (operation) 119                   |
| 229, 230, 233, 241                              | INT_LOGB (operation) 119                        |
| instruction counter 13, 47, 49, 164, 165, 166   | INT_MAX_TEST (operation) 120                    |
| instruction issue counter 49, 77, 165, 241      | INT_MAX (operation) 17, 120                     |

| INT_MEM_ADD_AC_DISP (operation) 121 | KERNEL level 12, 38, 51, 164, 241 |
|---|---|
| INT_MEM_ADD_AC_INDEX (operation) 121 | L |
| INT_MEM_ADD_DISP (operation) 121 | L1Address (struct) 36 |
| INT_MEM_ADD_INDEX (operation) 121 | L2Address (struct) 36, 37 |
| INT_MIN_TEST (operation) 122 | la (field in Operation) 15 |
| INT_MIN (operation) 122 | LEVEL_ENTER (operation) 38, 50, 132 |
| INT_NEAR_TEST (operation) 109 | LEVEL_RTN (operation) 38, 47, 132, 169 |
| INT_NEAR (operation) 109 | level 12, 15, 25, 26, 29, 32, 33, 36, 38, 39, 40, 41, 45, 47, 48, 50, 55, 80, 132, 138, 164, 198, 200, 206, 241 |
| INT_RECIP_APPROX (operation) 123 | Level (enum) 26, 33, 38, 39, 40, 41, 138, 169 |
| INT_RECIP_ERROR (operation) 124 | level (stream level) 132, 138, 164, 166 |
| INT_RECIP_SHIFT_TEST (operation) 125 | LEV_IPL (enum in Level) 38, 138, 241 |
| INT_RECIP_SHIFT (operation) 125 | LEV_KERNEL (enum in Level) 38 |
| INT_ROUND_TEST (operation) 109 | LEV_SUPER (enum in Level) 38, 48, 241 |
| INT_ROUND (operation) 109 | LEV_USER (enum in Level) 38, 138, 241 |
| INT_RSQRT_APPROX (operation) 126 | limit_error (field in IOPStatusWord) 194, 201, 204, 207, 208, 212 |
| INT_SHIFT_RIGHT_TEST (operation) 127 | limit (field in DataStateDescriptor) 12, 25, 29, 30, 32, 33, 39, 41, 45, 47, 198, 200, 206 |
| INT_SHIFT_RIGHT (operation) 127 | line_index (field in L1Address) 35, 36, 37 |
| INT_SUB_IMM_TEST (operation) 129 | link_error (field in IOPStatusWord) 194 |
| INT_SUB_IMM (operation) 129 | LOAD_AC_DISP (operation) 133 |
| INT_SUB_MUL_REV_TEST (operation) 130 | LOAD_AC_INDEX (operation) 133 |
| INT_SUB_MUL_REV (operation) 130 | LOAD_DATA (operation) 204 |
| INT_SUB_MUL_TEST (operation) 130 | LOAD_DISP (operation) 133 |
| INT_SUB_MUL (operation) 130 | LOAD_END_PACKET (operation) 203, 222 |
| INT_SUB_TEST (operation) 128 | LOAD_ERR_OFFSET (operation) 201 |
| INT_SUB (operation) 128 | LOAD_FE (operation) 134 |
| invalid operation 21 | LOAD_FLUSH (operation) 202 |
| IN_ACCEPT (operation) 226 | LOAD_IMAGE (operation) 204, 222 |
| IN_DELAY (operation) 229 | LOAD_INDEX (operation) 133 |
| IN_INTERCONNECT (operation) 228 | load_level (field in DataMapEntry) 26, 29 |
| IN_LINK_STORE (operation) 223 | LOAD_LINK_OUT (operation) 199, 215, 222 |
| IN_LINK (operation) 223 | LOAD_LINK (operation) 199 |
| IN_LISTEN (operation) 224 | LOAD_SEGMENT (operation) 200 |
| IN_LOOPBACK (operation) 215, 227 | LOAD (operation) 23, 24, 31, 49, 50, 60, 114, 121, 133, 207, 222, 231 |
| IN_REJECT (operation) 225 | locality 29 |
| IOPStatusWord (struct) 194, 195 | local memory 139 |
| IopStream (enum) 195 | locked (memory cell) 23, 25, 26, 28, 155, 163, 198, 200, 206 |
| IOP_IN (enum in IopStream) 195 | LOGICAL_ALLONE_TEST (operation) 135 |
| IOP_LOAD (enum in IopStream) 195 | LOGICAL_ALLONE (operation) 135, 136 |
| IOP_OUT (enum in IopStream) 195 | LOGICAL_ONE_TEST (operation) 136 |
| IOP_STORE (enum in IopStream) 195 | |
| J | |
| JUMP_OFTEN (operation) 16, 131 | |
| JUMP_SELDOM (operation) 16, 131 | |
| JUMP (operation) 13, 14, 50, 55, 60, 131, 169 | |
| K | |

| LOGICAL_ONE (operation) 135, 136 | NaN 17, 18, 20, 21, 22, 90, 93, 96, 97, 98, 99, 100, 101, 119, 143 |
|---|---|
| logical_unit (field in DataMapEntry) 26, 29 | negative 10, 17, 22, 90, 92, 96, 97, 98, 99, 119, 171, 190 |
| logical unit number 26, 29, 30, 198 | nonstandard 17 |
| logical unit offset 26, 29, 33, 36 | NOP (operation) 50, 52, 137 |
| lookahead_disable (field in StreamStatusWord) | normalize 21 |
| lookahead 12, 13, 15, 16, 31, 43, 47, 52, 81, 83, 131, 132, 147, 152, 163, 167, 180, | no_connection (field in IOPStatusWord) 194, 222 |
| loopback (field in IOPStatusWord) 195, 207, 215, 216, 217, 227 | O |
| M | OPA_DATA_OPA_SAVE (enum in RetryOpCode) 32 |
| M-operation 15, 47, 52 | OPA_DATA_OPD_SAVE (enum in RetryOpCode) 32 |
| M-unit 15, 16, 23, 31, 32, 46, 50, 55, 81, 137, 163 | OPA_DATA_STATE_RESTORE (enum in RetryOpCode) 32 |
| MAC-operation 15, 16, 44 | OPA_INT_FETCH_ADD (enum in RetryOpCode) 31 |
| map base 33 | OPA_INT_LOADB (enum in RetryOpCode) |
| map entry 25, 26, 28, 33, 35, 36, 241 | OPA_INT_LOADH (enum in RetryOpCode) 31 |
| map limit 25, 45, 80, 138, 140 | OPA_INT_LOADQ (enum in RetryOpCode) |
| matrix 29, 55, 62, 242 | OPA_LOAD (enum in RetryOpCode) 31 |
| MC-operation 15 | OPA_MAP_FLUSH_ANY (enum in RetryOpCode) 32 |
| memory_type (field in DataMapEntry) 26, 30 | OPA_MAP_FLUSH (enum in RetryOpCode) |
| memory reference counter 49, 77, 241 | OPA_PROBE (enum in RetryOpCode) 31 |
| memory retry 13, 25, 29, 30, 31, 39, 44, 47, 49, 198 | OPA_REG_LOAD (enum in RetryOpCode) 31 |
| memory system 9, 30, 36 | OPA_SCRUB_LOAD (enum in RetryOpCode) 31 |
| memory unit 10, 29, 30 | OPA_STATE_LOAD (enum in RetryOpCode) |
| min_dkill (field in DataStateDescriptor) 39, 80 | OPA_STATE_LOCK (enum in RetryOpCode) |
| min_pkill (field in ProgramStateDescriptor) 40, | OPA_STATE_STORE (enum in RetryOpCode) |
| min_psleep (field in ProgramStateDescriptor) | OPA_STOREB (enum in RetryOpCode) 31 |
| Mop (field in Operation) 15 | OPA_STOREH (enum in RetryOpCode) 31 |
| N | OPA_STOREQ (enum in RetryOpCode) 31 |
| NaNResultCode (enum) 21, 22 | OPA_STORE_EMPTY (enum in RetryOpCode) |
| NAN_INF_DIV_INF (enum in NaNResultCode) | OPA_STORE (enum in RetryOpCode) 31 |
| NAN_INF_SUB_INF (enum in NaNResultCode) | |
| NAN_SQRT_NEG (enum in NaNResultCode) | |
| NAN_ZERO_DIV_ZERO (enum in NaNResultCode) 22 | |
| NAN_ZERO_MUL_INF (enum in NaNResultCode) 22 | |

| OPA_STREAM_CATCH (enum in RetryOpCode) 32 | pc (field in IOPStatusWord) 13, 14, 33, 44, 47, 83, 131, 132, 152, 164, 166, 169, 194, 195, 201, 208, 224, 231 |
|---|---|
| OPA_STREAM_CREATE (enum in RetryOpCode) 32 | pc (program counter) 15 |
| OPA_UNS_LOADB (enum in RetryOpCode) | pf1 (field in ExceptionRegister) 46 |
| OPA_UNS_LOADH (enum in RetryOpCode) 31 | pf31 (field in ExceptionRegister) 46 |
| OPA_UNS_LOADQ (enum in RetryOpCode) | phantom counter 51, 79, 241 |
| operating system 37, 38, 39, 43, 48, 167, 168 | physical unit number 29, 30 |
| OperationAccessControl (struct) 24 | physical unit offset 29, 33 |
| OPERATION_1 (operation) 53 | pointer 9, 10, 23, 24, 25, 55, 134, 138, 142, 154, 155, 166 |
| OPERATION_2 (operation) 53 | Pointer (struct) 23, 24, 141, 143, 145 |
| Operation (struct) 15, 24, 52, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 194, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229, 233, 235, 237, 239 | poison 11, 32, 43, 44, 45, 46, 51, 81, 144, 145, 146, 149, 170 |
| os_field (field in IOPStatusWord) 194 | positive 13, 17, 90, 93, 96, 97, 98, 99, 119 |
| OUT_CANCEL (operation) 218 | precedence 40, 56 |
| OUT_DELAY (operation) 220 | privilege 12, 15, 25, 29, 36, 38, 39, 40, 41, 43, 44, 45, 47, 48, 55, 75, 78, 80, 82, 83, 84, 132, 138, 139, 140, 141, 164, 167, 169, 241 |
| OUT_DISCONNECT (operation) 217, 227 | priv_quit (field in ProgramStateDescriptor) 41, 167 |
| OUT_INTERCONNECT (operation) 219 | priv_t0 (field in ProgramStateDescriptor) 14, 41, 48, 169 |
| OUT_LINK_LOAD (operation) 213 | ProbeControl (enum) 138 |
| OUT_LINK (operation) 213 | PROBE_DISP (operation) 138 |
| OUT_LOOPBACK (operation) 215, 216, 227 | PROBE_INDEX (operation) 138 |
| OUT_LOOPMODE (operation) 216 | ProgFrame (a base type) 10, 33, 41 |
| OUT_PACKET (operation) 194, 203, 222 | ProgL2Address (struct) 35 |
| OUT_RESET (operation) 221 | ProgramAddress (struct) 10, 13, 33 |
| OUT_RING (operation) 214 | ProgramAddrUns (a base type) 10, 13, 35 |
| overflow/NaN 17, 58, 59, 60, 63, 64, 65, 66, 68, 70, 71, 73, 74, 90, 96, 97, 98, 99, 109, 112, 119, 120, 122, 127, 130, 148, 149, 150, 151, 168, 171, 173, 179 | ProgramMapEntry (struct) 33 |
| P | ProgramStateDescriptor (struct) 40, 41 |
| packet_end (field in IOPStatusWord) 194, 208, 212 | PROGRAM_CACHE_FLUSH_ANY (operation) 37, 139 |
| PageNumber (a base type) 10, 33, 41 | PROGRAM_CACHE_FLUSH_L1 (operation) 37, 139 |
| | PROGRAM_CACHE_FLUSH (operation) 37, 139 |
| | PROGRAM_MAP_FLUSH_ANY (operation) 35, 140 |
| | PROGRAM_MAP_FLUSH (operation) 35, 140 |
| | PROGRAM_STATE_RESTORE (operation) 40, |
| | program address 11, 13, 14, 33, 35, 36, 38, 44, 140, 241 |
| | program counter (PC) 10, 11, 12, 13, 14, 15, 36, 169, 180, 194 |

| program map limit 33, 41 | Reg (a base type) 10, 31, 46 |
|---|---|
| program map 10, 33, 35, 36, 38, 41, 132, 140 | Resource (enum) 26, 30, 39, 49, 51, 164, 165, 167 |
| program memory 10, 33, 35, 36, 37 | restop (field in DataControlDescriptor) 31 |
| program protection 33, 36, 43, 45, 132 | RESULTCODE_SAVE (operation) 147 |
| program segment 198 | ResultCode (struct) 46 |
| program state descriptor 33, 38, 39, 40, 48, 141, 167, 169, 241 | result code (result code register) 147, 164, 166 |
| ProgTlbAddr (struct) 35 | result code register 11, 21, 43, 46, 47, 81, 85, 147, 166, 241 |
| prog_frame (field in ProgramMapEntry) 33, | result code 21, 22, 46, 47, 81 |
| prog_hw_error (field in ExceptionRegister) 44, | RES_DMEM (enum in Resource) 30 |
| prog_page_number (field in ProgramAddress) | RES_IOM (enum in Resource) 30 |
| prog_page_offset (field in ProgramAddress) 33 | RES_IOP (enum in Resource) 30 |
| prog_prot (field in ExceptionRegister) 44, 45, 48, 132 | RetryOpCode (enum) 31, 32 |
| protection domain 12, 14, 26, 33, 38, 39, 40, 41, 48, 49, 55, 77, 82, 83, 84, 141, 163, 165, 166, 167, 168, 241 | retry_limit (field in DataStateDescriptor) 39 |
| PTR_SET_AC (operation) 142 | RND_CEIL (enum in RoundMode) 21 |
| p_limit_error (field in IOPStatusWord) 194, 198 | RND_CHOP (enum in RoundMode) 21 |
| P_MODIFY (enum in ProbeControl) 138 | RND_FLOOR (enum in RoundMode) 21 |
| P_READ (enum in ProbeControl) 138 | RND_NEAR (enum in RoundMode) 21 |
| p_uncorrectable_error (field in IOPStatusWord) 194, 198 | ROTATE_LEFT_TEST (operation) 148 |
| p_unimplemented_address (field in IOPStatusWord) 194, 198 | ROTATE_LEFT (operation) 148 |
| Q | ROTATE_RIGHT_TEST (operation) 148 |
| quarterword 9, 23, 24, 118, 161, 177 | ROTATE_RIGHT (operation) 148 |
| R | rounding mode 13, 21, 86, 94, 108, 109, 171, 185 |
| ready counter 51, 79, 222, 241 | RoundMode (enum) 13, 21 |
| REAL_FLOAT (operation) 143 | round_mode (field in StreamStatusWord) 13, |
| reciprocal 89, 91, 95, 102, 103, 105, 123, 126, 188, 190, 191, 241 | round 21, 86, 109, 113, 125, 171, 174, 178, 185, 188, 218 |
| REG_LOAD_AC_DISP (operation) 144 | round to nearest 21, 92, 93, 95, 103, 124, 185 |
| REG_LOAD_AC_INDEX (operation) 144 | S |
| REG_LOAD_DISP (operation) 144 | scramble 26, 29, 33, 36 |
| REG_LOAD_INDEX (operation) 144 | scur (streams in use) 39, 83, 163, 166, 167, 241 |
| REG_MOVE (operation) 32, 60, 145 | SegmentOffset (a base type) 10, 26 |
| REG_STORE_AC_DISP (operation) 146 | segment_base (field in DataMapEntry) 26, 29 |
| REG_STORE_AC_INDEX (operation) 146 | segment_limit (field in DataMapEntry) 26, 29 |
| REG_STORE_DISP (operation) 146 | segment_offset (field in DomainDataAddress) |
| REG_STORE_INDEX (operation) 146 | segment base 26, 29 |
| | segment limit 25, 26, 29, 45 |
| | SELECT_FLOAT_TEST (operation) 149 |
| | SELECT_FLOAT (operation) 19, 149 |
| | SELECT_INT_TEST (operation) 149 |
| | SELECT_INT (operation) 19, 149 |
| | sel_0 (field in EventSelect) 49 |

sel\_1 (field in EventSelect) 49
sel\_2 (field in EventSelect) 49
sel\_3 (field in EventSelect) 49
SEL\_CY (enum in IntSelect) 19, 149
SEL\_EQ (enum in IntSelect) 19, 149
SEL\_FGE (enum in FloatSelect) 19, 149
SEL\_FGT (enum in FloatSelect) 19, 149
SEL\_FLE (enum in FloatSelect) 19, 149
SEL\_FLT (enum in FloatSelect) 19, 149
SEL\_FUN (enum in FloatSelect) 19, 149
SEL\_IGE (enum in IntSelect) 19, 149
SEL\_IGT (enum in IntSelect) 19, 149
SEL\_IPL (enum in IntSelect) 19, 149
SEL\_IPZ (enum in IntSelect) 19, 149
SEL\_UGE (enum in IntSelect) 19, 149
SEL\_UGT (enum in IntSelect) 19, 149
set\_index (field in DomainDataTLBAddress) 28
set\_number (field in DomainDataAddress) 28, 35, 36, 37
SHIFT\_LEFT\_IMM\_TEST (operation) 150
SHIFT\_LEFT\_IMM (operation) 150
SHIFT\_LEFT\_TEST (operation) 150
SHIFT\_LEFT (operation) 150
SHIFT\_PAIR\_LEFT\_TEST (operation) 151
SHIFT\_PAIR\_LEFT (operation) 151
SHIFT\_PAIR\_RIGHT\_TEST (operation) 151
SHIFT\_PAIR\_RIGHT (operation) 151
significand 20, 143
sign (field in Float32) 17, 20, 88, 93, 107, 116, 117, 118, 190
sign bit 20, 56, 127
SKIP\_OFTEN (operation) 16, 152
SKIP\_SELDOM (operation) 16, 152
SKIP (operation) 13, 50, 152
sleep 40
slim (field in ProgramStateDescriptor) 41
slim (stream count limit) 39, 41, 168
SpecialFloat64 (struct) 20, 91, 92, 95, 102, 103, 105, 113, 123, 124, 126, 174
spec\_load\_enable (field in StreamStatusWord) 13, 32
square root 20, 56, 89, 92, 93, 95, 103, 105, 126, 188, 241
sres (streams reserved) 39, 49, 83, 163, 166, 167, 168, 241
SSW\_DISP (operation) 153
SSW\_INDEX (operation) 153
ssw\_override (field in StreamStatusWord) 13, 25, 47, 132
SSW\_RESTORE (operation) 153
ssw (stream status word) 10, 11, 12, 14, 16, 21, 25, 26, 32, 43, 44, 47, 49, 83, 85, 86, 94, 108, 109, 131, 132, 152, 153, 164, 166, 169, 171
stall (field in DataMapEntry) 26, 29, 48
standard 16, 20, 21
STATE\_LOAD\_DISP (operation) 154
STATE\_LOAD\_INDEX (operation) 154
STATE\_LOCK\_AC\_DISP (operation) 155
STATE\_LOCK\_AC\_INDEX (operation) 155
STATE\_LOCK\_DISP (operation) 155
STATE\_LOCK\_INDEX (operation) 155
STATE\_SCRUB\_DISP (operation) 156
STATE\_SCRUB\_INDEX (operation) 156
STATE\_STORE\_AC\_DISP (operation) 157
STATE\_STORE\_AC\_INDEX (operation) 157
STATE\_STORE\_DISP (operation) 157
STATE\_STORE\_ERROR\_DISP (operation) 158
STATE\_STORE\_ERROR\_INDEX (operation) 158
STATE\_STORE\_INDEX (operation) 157
status\_link (field in IOPStatusWord) 194, 201, 208, 224
STOREB\_AC\_DISP (operation) 159
STOREB\_AC\_INDEX (operation) 159
STOREB\_DISP (operation) 159
STOREB\_INDEX (operation) 159
STOREB (operation) 31, 159
STOREH\_AC\_DISP (operation) 160
STOREH\_AC\_INDEX (operation) 160
STOREH\_DISP (operation) 160
STOREH\_INDEX (operation) 160
STOREH (operation) 31, 160
STOREQ\_AC\_DISP (operation) 161
STOREQ\_AC\_INDEX (operation) 161
STOREQ\_DISP (operation) 161
STOREQ\_INDEX (operation) 161
STOREQ (operation) 31, 161
STORE\_AC\_DISP (operation) 162
STORE\_AC\_INDEX (operation) 162
STORE\_DATA (operation) 210, 212, 223
STORE\_DISP (operation) 162
STORE\_END\_PACKET (operation) 210
STORE\_END\_SEGMENT (operation) 211
STORE\_ERR\_OFFSET (operation) 208

STORE\_FLUSH (operation) 209
STORE\_IMAGE (operation) 210, 212, 223
STORE\_INDEX (operation) 162
store\_level (field in DataMapEntry) 26, 29
STORE\_LINK\_IN (operation) 205, 224, 227,
STORE\_LINK (operation) 205
STORE\_REPLICATE (operation) 207, 223
STORE\_SEGMENT (operation) 206
STORE (operation) 23, 24, 31, 49, 50, 55, 114, 121, 162, 207, 224
StreamStatusWord (struct) 12, 13, 169
STREAM\_CATCH (operation) 32, 164
STREAM\_COUNT\_INST\_RESTORE (operation) 49, 165
STREAM\_COUNT\_INST (operation) 49, 165
STREAM\_CREATE\_IMM (operation) 166
STREAM\_CUR\_SAVE (operation) 39, 163
STREAM\_IDENTIFIER\_SAVE (operation) 163
stream\_identity (field in IOPStatusWord) 195
STREAM\_LOOKAHEAD\_SAVE (operation) 163
STREAM\_QUIT\_PRESERVE (operation) 167
STREAM\_QUIT (operation) 12, 15, 39, 40, 41, 48, 51, 166, 167
STREAM\_RESERVE\_TEST (operation) 168
STREAM\_RESERVE\_UPTO\_TEST (operation) 168
STREAM\_RESERVE\_UPTO (operation) 168
STREAM\_RESERVE (operation) 12, 39, 166,
STREAM\_RES\_SAVE (operation) 39, 163
stream 10, 11, 12, 14, 15, 17, 23, 25, 29, 31, 33, 36, 38, 39, 40, 41, 43, 44, 45, 47, 48, 49, 51, 55, 77, 79, 81, 83, 132, 163, 165, 166, 167, 168, 182, 194, 195, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 212, 213, 214, 215, 216, 217, 222, 223, 224, 226, 227, 230, 231, 241
stream counter 49, 77, 241
stream create exception 43, 45
stream level 45, 241
stream number 163
stream status word (ssw) 10, 11, 12, 33, 194, 241
structure name 9
subblock\_index (field in L2Address) 35, 37
supervisor level 12, 29, 38, 78, 241
synchronization 25
system\_trap\_disable (field in StreamStatusWord) 12
system trap 12
S (total stream count) 168

T

tag (field in DomainDataAddress) 28, 35, 36, 37, 49, 50, 77
TARGET\_DISP (operation) 60, 169
TARGET\_INDEX (operation) 169
TARGET\_RESTORE (operation) 153, 169
TARGET\_SAVE (operation) 50, 169
target register T0 14, 41, 44, 47, 48, 164, 166
target register 11, 14, 16, 33, 50, 131, 153, 166, 169, 241
timeout (field in IOPStatusWord) 194, 214, 215, 217, 218, 219, 220, 221, 222, 224, 226, 227, 228, 229, 231
tn (target register) 131
trap0\_disable (field in DataControlDescriptor) 24, 31
trap0\_enable (field in AccessState) 23, 154, 157
trap0\_load\_disable (field in Pointer) 23, 154,
trap0\_store\_disable (field in Pointer) 23, 154, 157
trap1\_disable (field in DataControlDescriptor) 24, 31
trap1\_enable (field in AccessState) 23, 154, 157
trap1\_load\_disable (field in Pointer) 23, 154,
trap1\_store\_disable (field in Pointer) 23, 154, 157
TRAP\_RESTORE (operation) 32, 48, 170
TRAP\_SAVE (operation) 48, 170
trap 11, 12, 13, 14, 15, 21, 25, 29, 40, 43, 44, 46, 47, 48, 51, 81, 85, 138, 144, 146, 153, 154, 156, 157, 158, 166, 188, 212
trap handler 11, 14, 21, 22, 31, 41, 43, 44, 47, 48, 81, 166, 170, 188
trap mask 11, 12, 43, 47, 85, 131
trap register TR0 48
trap register 11, 48, 166, 167, 170, 241

U

unaligned\_data\_enable (field in StreamStatusWord) 13, 25, 26
unaligned data 13, 43, 45
uncorrectable\_error (field in IOPStatusWord) 194, 201, 204, 207, 208, 210, 211, 212, 222, 226
uncorrectable program memory error 43, 45
unimplemented\_address (field in IOPStatusWord) 194, 201, 204, 207, 208, 212
unimplemented operation 47
unsigned integer 9, 15, 20, 55, 108, 110, 133, 134, 168, 171, 173, 174, 190
UNS\_ADD\_CARRY\_TEST (operation) 172
UNS\_ADD\_MUL\_UPPER\_TEST (operation) 173
UNS\_ADD\_MUL\_UPPER (operation) 173
UNS\_CEIL\_TEST (operation) 171
UNS\_CEIL (operation) 171
UNS\_CHOP\_TEST (operation) 171
UNS\_CHOP (operation) 171
UNS\_DIV\_TEST (operation) 174
UNS\_DIV (operation) 174
UNS\_FLOOR\_TEST (operation) 171
UNS\_FLOOR (operation) 21, 171
UNS\_LOADB\_AC\_DISP (operation) 175
UNS\_LOADB\_AC\_INDEX (operation) 175
UNS\_LOADB\_DISP (operation) 175
UNS\_LOADB\_INDEX (operation) 175
UNS\_LOADB (operation) 31, 175
UNS\_LOADH\_AC\_DISP (operation) 176
UNS\_LOADH\_AC\_INDEX (operation) 176
UNS\_LOADH\_DISP (operation) 176
UNS\_LOADH\_INDEX (operation) 176
UNS\_LOADH (operation) 31, 176
UNS\_LOADQ\_AC\_DISP (operation) 177
UNS\_LOADQ\_AC\_INDEX (operation) 177
UNS\_LOADQ\_DISP (operation) 177
UNS\_LOADQ\_INDEX (operation) 177
UNS\_LOADQ (operation) 31, 137, 177
UNS\_NEAR\_TEST (operation) 171
UNS\_NEAR (operation) 171
UNS\_RECIP\_SHIFT\_TEST (operation) 178
UNS\_RECIP\_SHIFT (operation) 178
UNS\_ROUND\_TEST (operation) 171
UNS\_ROUND (operation) 171
UNS\_SHIFT\_RIGHT\_TEST (operation) 179
UNS\_SHIFT\_RIGHT (operation) 179
UNS\_SUB\_CARRY\_TEST (operation) 172
user\_trap\_disable (field in StreamStatusWord) 12
user trap 12

W

word 9, 10, 23, 24, 25, 30, 33, 36, 53, 55, 60, 62, 70, 81, 114, 121, 133, 134, 139, 144, 146, 148, 150, 155, 156, 157, 158, 162, 194, 195, 198, 199, 201, 203, 204, 205, 207, 208, 212, 213, 214, 222, 223, 224, 226, 231, 233

Z

zero divide 46